AITopics

2209.00137

Country:

North America > United States > California (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (1.00)

Industry: Health & Medicine (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Beyond Greedy Search: Tracking by Multi-Agent Reinforcement Learning-based Beam Search

Wang, Xiao, Chen, Zhe, Jiang, Bo, Tang, Jin, Luo, Bin, Tao, Dacheng

To track the target in a video, current visual trackers usually adopt greedy search for target object localization in each frame, that is, the candidate region with the maximum response score is selected as the tracking result of each frame. However, we found that this may not be an optimal choice, especially when encountering challenging tracking scenarios such as heavy occlusion and fast motion. To address this issue, we propose to maintain multiple tracking trajectories and apply a beam search strategy for visual tracking, so that the trajectory with fewer accumulated errors can be identified. Accordingly, this paper introduces a novel multi-agent reinforcement learning based beam search tracking strategy, termed BeamTracking. It is mainly inspired by the image captioning task, which takes an image as input and generates diverse descriptions using the beam search algorithm. Accordingly, we formulate the tracking as a sample selection problem fulfilled by multiple parallel decision-making processes, each of which aims at picking out one sample as its tracking result in each frame. Each maintained trajectory is associated with an agent to perform the decision-making and determine what actions should be taken to update related information. When all the frames are processed, we select the trajectory with the maximum accumulated score as the tracking result. Extensive experiments on seven popular tracking benchmark datasets validated the effectiveness of the proposed algorithm.
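The contrast between greedy search and beam search over per-frame candidates can be illustrated with a minimal sketch. It is not the BeamTracking method itself (which uses multi-agent reinforcement learning); it is plain beam search under assumed inputs: `responses[t][c]` is a hypothetical response score for candidate `c` in frame `t`, candidates are 1-D positions, and `jump_penalty` is an illustrative motion cost that makes the best choice in one frame depend on the previous one.

```python
from typing import List, Tuple


def beam_search_track(
    responses: List[List[float]], beam_width: int, jump_penalty: float
) -> Tuple[List[int], float]:
    """Return (path, accumulated_score) of the best trajectory kept by the beam.

    responses[t][c] is an assumed response score of candidate c in frame t.
    jump_penalty is a hypothetical cost per unit of position change.
    beam_width=1 reduces to greedy per-frame selection.
    """
    # Each beam entry is (accumulated_score, list_of_chosen_candidates).
    beams: List[Tuple[float, List[int]]] = [(0.0, [])]
    for scores in responses:
        expanded = []
        for total, path in beams:
            for c, s in enumerate(scores):
                # No motion penalty for the first frame (no previous position).
                move = abs(c - path[-1]) if path else 0
                expanded.append((total + s - jump_penalty * move, path + [c]))
        # Keep only the beam_width highest-scoring partial trajectories.
        expanded.sort(key=lambda item: item[0], reverse=True)
        beams = expanded[:beam_width]
    best_total, best_path = beams[0]
    return best_path, best_total


# Toy data: candidate 0 looks best in frame 0, but candidate 2 is better overall.
toy = [[1.0, 0.0, 0.9], [0.0, 0.0, 1.0]]
greedy_path, greedy_score = beam_search_track(toy, beam_width=1, jump_penalty=0.3)
beam_path, beam_score = beam_search_track(toy, beam_width=2, jump_penalty=0.3)
```

On this toy input the greedy run commits to candidate 0 in the first frame and pays a motion penalty later, while the width-2 beam keeps the slightly weaker candidate alive and ends with a higher accumulated score.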

artificial intelligence, machine learning, reinforcement learning, (17 more...)

doi: 10.1109/TIP.2022.3208437

2205.09676

Country:

Asia > China > Anhui Province > Hefei (0.04)
Oceania > Australia > New South Wales > Sydney (0.04)
Europe > United Kingdom (0.04)
(2 more...)

Genre: Overview (0.93)

Industry: Education (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
(2 more...)

Effective Multi-User Delay-Constrained Scheduling with Deep Recurrent Reinforcement Learning

Hu, Pihe, Pan, Ling, Chen, Yu, Fang, Zhixuan, Huang, Longbo

Multi-user delay-constrained scheduling is important in many real-world applications including wireless communication, live streaming, and cloud computing. Yet, it poses a critical challenge since the scheduler needs to make real-time decisions to guarantee the delay and resource constraints simultaneously without prior information of system dynamics, which can be time-varying and hard to estimate. Moreover, many practical scenarios suffer from partial observability issues, e.g., due to sensing noise or hidden correlation. To tackle these challenges, we propose a deep reinforcement learning (DRL) algorithm, named Recurrent Softmax Delayed Deep Double Deterministic Policy Gradient ($\mathtt{RSD4}$), which is a data-driven method based on a Partially Observed Markov Decision Process (POMDP) formulation. $\mathtt{RSD4}$ guarantees resource and delay constraints by Lagrangian dual and delay-sensitive queues, respectively. It also efficiently tackles partial observability with a memory mechanism enabled by the recurrent neural network (RNN) and introduces user-level decomposition and node-level merging to ensure scalability. Extensive experiments on simulated/real-world datasets demonstrate that $\mathtt{RSD4}$ is robust to system dynamics and partially observable environments, and achieves superior performance over existing DRL and non-DRL-based methods.
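The abstract does not spell out how the Lagrangian dual is used, so the following is only a generic sketch of the standard idea: the resource cost is folded into the reward through a multiplier, and the multiplier is adjusted by projected dual ascent. The function names, `budget`, and `step_size` are illustrative assumptions, not part of $\mathtt{RSD4}$.

```python
def penalized_reward(reward: float, cost: float, multiplier: float) -> float:
    """Lagrangian-relaxed reward: the resource cost is priced by the multiplier."""
    return reward - multiplier * cost


def dual_update(multiplier: float, usage: float, budget: float, step_size: float) -> float:
    """One projected dual-ascent step.

    The multiplier grows when measured usage exceeds the budget and shrinks
    otherwise; the max(0, .) projection keeps it non-negative.
    """
    return max(0.0, multiplier + step_size * (usage - budget))


# Usage above budget raises the price of the resource ...
raised = dual_update(multiplier=0.0, usage=3.0, budget=2.0, step_size=0.5)
# ... and usage below budget lowers it, but never below zero.
lowered = dual_update(multiplier=0.1, usage=1.0, budget=2.0, step_size=0.5)
```

In a learning loop, a policy would be trained on `penalized_reward` while `dual_update` is called periodically with the observed average resource usage.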

algorithm, rsd4, scheduling, (15 more...)

doi: 10.1145/3492866.3549712

2208.14074

Country:

Asia > China > Guangdong Province > Shenzhen (0.04)
Asia > China > Beijing > Beijing (0.04)

Genre: Research Report (1.00)

Industry:

Telecommunications (0.67)
Transportation (0.46)
Leisure & Entertainment (0.46)
Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.56)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.51)

Babu, Nithin, Popovski, Petar, Papadias, Constantinos B.

Cost-Efficient Deployment of a Reliable Multi-UAV Unmanned Aerial System

In this work, we study the trade-off between the reliability and the investment cost of an unmanned aerial system (UAS) consisting of a set of unmanned aerial vehicles (UAVs) carrying radio access nodes, called portable access points (PAPs), deployed to serve a set of ground nodes (GNs). Using the proposed algorithm, a given geographical region is equivalently represented as a set of circular regions, where each circle represents the coverage region of a PAP. Then, the steady-state availability of the UAS is analytically derived by modelling it as a continuous time birth-death Markov decision process (MDP). Numerical evaluations show that the investment cost to guarantee a given steady-state availability to a set of GNs can be reduced by considering the traffic demand and distribution of GNs.
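As an illustration of how a steady-state availability can be obtained from a continuous-time birth-death chain, the sketch below uses the detailed-balance relation pi[k+1] * death[k] = pi[k] * birth[k]. The state definition is an assumption made here for illustration (state k = number of failed units, with failure rates as births and repair rates as deaths); the paper's actual model is not given in the abstract.

```python
from typing import List, Sequence


def birth_death_stationary(birth: Sequence[float], death: Sequence[float]) -> List[float]:
    """Stationary distribution of a finite birth-death chain.

    birth[k] is the rate from state k to k+1, death[k] the rate from k+1 to k.
    """
    if len(birth) != len(death):
        raise ValueError("birth and death must have the same length")
    # Unnormalized weights from detailed balance, starting at state 0.
    weights = [1.0]
    for b, d in zip(birth, death):
        weights.append(weights[-1] * b / d)
    total = sum(weights)
    return [w / total for w in weights]


def availability(pi: Sequence[float], max_failed: int) -> float:
    """Probability of being in a state with at most max_failed failures (assumed 'up' states)."""
    return sum(pi[: max_failed + 1])


# Two-state example: failure rate 1, repair rate 9, so the unit is up 90% of the time.
pi = birth_death_stationary(birth=[1.0], death=[9.0])
up_probability = availability(pi, max_failed=0)
```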

availability, investment cost, pap, (15 more...)

2208.14503

Country:

Europe > Greece (0.05)
Europe > Denmark > North Jutland > Aalborg (0.04)
Asia > Middle East > Oman (0.04)

Genre: Research Report (0.40)

Industry:

Aerospace & Defense (0.91)
Transportation > Infrastructure & Services (0.61)
Transportation > Air (0.61)

Technology:

Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles > Drones (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.35)

Discriminative Learning of Similarity and Group Equivariant Representations

Trivedi, Shubhendu

artificial intelligence, metric learning problem, neural information processing system, (15 more...)

One of the most fundamental problems in machine learning is to compare examples: Given a pair of objects we want to return a value which indicates degree of (dis)similarity. Similarity is often task specific, and pre-defined distances can perform poorly, leading to work in metric learning. However, being able to learn a similarity-sensitive distance function also presupposes access to a rich, discriminative representation for the objects at hand. In this dissertation we present contributions towards both ends. In the first part of the thesis, assuming good representations for the data, we present a formulation for metric learning that makes a more direct attempt to optimize for the k-NN accuracy as compared to prior work. We also present extensions of this formulation to metric learning for k-NN regression, asymmetric similarity learning and discriminative learning of Hamming distance. In the second part, we consider a situation where we are on a limited computational budget, i.e., optimizing over a space of possible metrics would be infeasible, but access to a label-aware distance metric is still desirable. We present a simple and computationally inexpensive approach for estimating a well-motivated metric that relies only on gradient estimates, discussing theoretical and experimental results. In the final part, we address representational issues, considering group equivariant convolutional neural networks (GCNNs). Equivariance to symmetry transformations is explicitly encoded in GCNNs; a classical CNN being the simplest example. In particular, we present an SO(3)-equivariant neural network architecture for spherical data, that operates entirely in Fourier space, while also providing a formalism for the design of fully Fourier neural networks that are equivariant to the action of any continuous compact group.
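To make concrete why the choice of metric matters for k-NN accuracy, here is a small sketch of k-NN classification under a linear (Mahalanobis-style) metric, where the distance is the squared Euclidean distance after applying a matrix `L`. It does not implement the thesis's learning formulation; `L` is simply supplied by hand, and all names are illustrative.

```python
from collections import Counter
from typing import List, Sequence


def transform(L: Sequence[Sequence[float]], x: Sequence[float]) -> List[float]:
    """Apply the linear map L to the vector x."""
    return [sum(l_ij * x_j for l_ij, x_j in zip(row, x)) for row in L]


def knn_predict(L, train_x, train_y, query, k: int):
    """Majority vote among the k nearest training points under d(x, y) = ||L(x - y)||^2."""
    q = transform(L, query)
    dists = []
    for x, y in zip(train_x, train_y):
        z = transform(L, x)
        dists.append((sum((a - b) ** 2 for a, b in zip(z, q)), y))
    dists.sort(key=lambda pair: pair[0])
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]


# Toy data: the label depends only on the first coordinate; the second is noise.
train_x = [(0.0, 0.0), (1.0, 5.0)]
train_y = ["a", "b"]
query = (0.9, 0.5)

identity = [[1.0, 0.0], [0.0, 1.0]]
ignore_noise = [[1.0, 0.0], [0.0, 0.0]]  # hand-picked metric that drops the noisy axis

euclidean_label = knn_predict(identity, train_x, train_y, query, k=1)
metric_label = knn_predict(ignore_noise, train_x, train_y, query, k=1)
```

Under the plain Euclidean distance the noisy coordinate dominates and the query is assigned to "a"; under the hand-picked metric it is assigned to "b", which matches the first coordinate.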

1808.10078

Country:

North America > United States > Illinois > Cook County > Chicago (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(12 more...)

Genre:

Research Report > New Finding (1.00)
Instructional Material > Course Syllabus & Notes (1.00)
Summary/Review (0.87)

Industry:

Health & Medicine (0.67)
Education > Educational Setting > Higher Education (0.45)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.67)

arXiv.org Artificial Intelligence Aug-29-2022

Decentralized Coordination in Partially Observable Queueing Networks

Jia, Jiekai, Tahir, Anam, Koeppl, Heinz

We consider communication in a fully cooperative multi-agent system, where the agents have partial observation of the environment and must act jointly to maximize the overall reward. We have a discrete-time queueing network where agents route packets to queues based only on the partial information of the current queue lengths. The queues have limited buffer capacity, so packet drops happen when they are sent to a full queue. In this work, we implemented a communication channel for the agents to share their information in order to reduce the packet drop rate. For efficient information sharing we use an attention-based communication model, called ATVC, to select informative messages from other agents. The agents then infer the state of queues using a combination of the variational auto-encoder, VAE, and product-of-experts, PoE, model. Ultimately, the agents learn what they need to communicate and with whom, instead of communicating all the time with everyone. We also show empirically that ATVC is able to infer the true state of the queues and leads to a policy which outperforms existing baselines.
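The abstract names a product-of-experts (PoE) model without giving its form. For univariate Gaussian experts, which is an assumption made here purely for illustration, the product has a closed form: precisions add, and the mean is the precision-weighted average of the expert means. A minimal sketch:

```python
from typing import Iterable, Tuple


def product_of_gaussian_experts(experts: Iterable[Tuple[float, float]]) -> Tuple[float, float]:
    """Combine (mean, variance) Gaussian experts into one Gaussian (mean, variance).

    The product of Gaussian densities is proportional to a Gaussian whose
    precision is the sum of the expert precisions.
    """
    experts = list(experts)
    precision = sum(1.0 / var for _, var in experts)
    mean = sum(mu / var for mu, var in experts) / precision
    return mean, 1.0 / precision


# Two equally confident experts that disagree: the product sits halfway
# between them and is more certain than either one.
combined_mean, combined_var = product_of_gaussian_experts([(0.0, 1.0), (2.0, 1.0)])
```

In a setting like the one described, each expert could stand for one agent's belief about a queue length, with more confident (lower-variance) agents pulling the combined estimate toward their own.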

agent, queue, scheduler, (13 more...)

2208.13621

Country:

Europe > Germany > Hesse > Darmstadt Region > Darmstadt (0.05)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.42)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.47)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.35)

Maree, Charl, Omlin, Christian W.

Symbolic Explanation of Affinity-Based Reinforcement Learning Agents with Markov Models

arXiv.org Artificial Intelligence Aug-29-2022

The proliferation of artificial intelligence is increasingly dependent on model understanding. Understanding demands both an interpretation - a human reasoning about a model's behavior - and an explanation - a symbolic representation of the functioning of the model. Notwithstanding the imperative of transparency for safety, trust, and acceptance, the opacity of state-of-the-art reinforcement learning algorithms conceals the rudiments of their learned strategies. We have developed a policy regularization method that asserts the global intrinsic affinities of learned strategies. These affinities provide a means of reasoning about a policy's behavior, thus making it inherently interpretable. We have demonstrated our method in personalized prosperity management where individuals' spending behavior in time dictates their investment strategies, i.e., distinct spending personalities may have dissimilar associations with different investment classes. We now explain our model by reproducing the underlying prototypical policies with discretized Markov models. These global surrogates are symbolic representations of the prototypical policies.
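One simple way to build a discretized Markov model from observed behavior is to count state-to-state transitions and normalize each row. The sketch below shows only that generic estimation step; how the paper discretizes policies into states is not described in the abstract, so the integer state labels and the uniform fallback for unvisited states are assumptions.

```python
from typing import List, Sequence


def estimate_transition_matrix(
    sequences: Sequence[Sequence[int]], n_states: int
) -> List[List[float]]:
    """Maximum-likelihood transition matrix from sequences of state indices."""
    counts = [[0] * n_states for _ in range(n_states)]
    for seq in sequences:
        # Count every consecutive pair (current state, next state).
        for current, nxt in zip(seq, seq[1:]):
            counts[current][nxt] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        if total == 0:
            # Assumption: a state never left in the data gets a uniform row.
            matrix.append([1.0 / n_states] * n_states)
        else:
            matrix.append([c / total for c in row])
    return matrix


# One trajectory over two states: 0 -> 1 -> 1 -> 0.
P = estimate_transition_matrix([[0, 1, 1, 0]], n_states=2)
```

Each row of `P` is a probability distribution over next states, which is what makes such a surrogate readable as a symbolic summary of behavior.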

agent, markov model, reinforcement, (15 more...)

2208.12627

Country:

North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
Europe > Spain > Aragón (0.04)
Europe > Norway > Western Norway > Rogaland > Stavanger (0.04)

Genre: Research Report (0.64)

Industry: Banking & Finance > Trading (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Bakirtzis, Georgios, Savvas, Michail, Topcu, Ufuk

Categorical semantics of compositional reinforcement learning

arXiv.org Artificial Intelligence Aug-29-2022

Reinforcement learning (RL) often requires decomposing a problem into subtasks and composing learned behaviors on these tasks. Compositionality in RL has the potential to create modular subtask units that interface with other system capabilities. However, generating compositional models requires the characterization of minimal assumptions for the robustness of the compositional feature. We develop a framework for a compositional theory of RL using a categorical point of view. Given the categorical representation of compositionality, we investigate sufficient conditions under which learning-by-parts results in the same optimal policy as learning on the whole. In particular, our approach introduces a category $\mathsf{MDP}$, whose objects are Markov decision processes (MDPs) acting as models of tasks. We show that $\mathsf{MDP}$ admits natural compositional operations, such as certain fiber products and pushouts. These operations make explicit compositional phenomena in RL and unify existing constructions, such as puncturing hazardous states in composite MDPs and incorporating state-action symmetry. We also model sequential task completion by introducing the language of zig-zag diagrams that is an immediate application of the pushout operation in $\mathsf{MDP}$.

category, diagram, mdp, (16 more...)

2208.13687

Country: Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.85)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.34)

Garg, Muskan, Aggarwal, Naveen

Minimal Feature Analysis for Isolated Digit Recognition for varying encoding rates in noisy environments

arXiv.org Artificial Intelligence Aug-27-2022

This research work concerns recent developments in speech recognition. It analyzes isolated digit recognition at different bit rates and at different noise levels. The experiments were carried out using Audacity and the HTK toolkit, with a Hidden Markov Model (HMM) as the recognition model. The feature extraction techniques used are Mel Frequency Cepstral Coefficients (MFCC), Linear Predictive Coding (LPC), perceptual linear prediction (PLP), mel spectrum (MELSPEC), and filter bank (FBANK). Three types of noise were considered for testing: random noise, fan noise, and random noise in a real-time environment. This was done to identify the environment best suited to real-time applications. Further, five commonly used bit rates at different sampling rates were considered to find the optimal bit rate.
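Several of the listed features (MFCC, MELSPEC, FBANK) are computed on the mel frequency scale. As a small worked illustration, the sketch below uses a widely used mel mapping, mel(f) = 2595 * log10(1 + f / 700); whether the paper's configuration uses exactly this variant is an assumption, and the function names are illustrative.

```python
import math


def hz_to_mel(frequency_hz: float) -> float:
    """Map a frequency in Hz to the mel scale (common 2595 / 700 convention)."""
    return 2595.0 * math.log10(1.0 + frequency_hz / 700.0)


def mel_to_hz(mel: float) -> float:
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


# Equal steps on the mel scale correspond to wider and wider bands in Hz,
# which is why mel filter banks are dense at low frequencies.
low_band_hz = mel_to_hz(200.0) - mel_to_hz(100.0)
high_band_hz = mel_to_hz(2100.0) - mel_to_hz(2000.0)
```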

bit rate, isolated digit recognition, recognition, (10 more...)

2208.131

Country:

Asia > India (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Japan > Honshū > Kantō > Kanagawa Prefecture (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.91)

Kahl, Matthias, Jorde, Daniel, Jacobsen, Hans-Arno

Representation Learning for Appliance Recognition: A Comparison to Classical Machine Learning

arXiv.org Artificial Intelligence Aug-26-2022

Non-intrusive load monitoring (NILM) aims at energy consumption and appliance state information retrieval from aggregated consumption measurements, with the help of signal processing and machine learning algorithms. Representation learning with deep neural networks is successfully applied to several related disciplines. The main advantage of representation learning lies in replacing an expert-driven, hand-crafted feature extraction with hierarchical learning from many representations in raw data format. In this paper, we show how the NILM processing-chain can be improved, reduced in complexity and alternatively designed with recent deep learning algorithms. On the basis of an event-based appliance recognition approach, we evaluate seven different classification models: a classical machine learning approach that is based on a hand-crafted feature extraction, three different deep neural network architectures for automated feature extraction on raw waveform data, as well as three baseline approaches for raw data processing. We evaluate all approaches on two large-scale energy consumption datasets with more than 50,000 events of 44 appliances. We show that with the use of deep learning, we are able to reach and surpass the performance of the state-of-the-art classical machine learning approach for appliance recognition with an F-Score of 0.75 and 0.86 compared to 0.69 and 0.87 of the classical approach.

appliance, artificial intelligence, deep learning, (13 more...)

2209.03759

Country:

Europe > United Kingdom (0.04)
Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)

Genre: Research Report (0.82)

Industry: Energy > Power Industry (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)