AITopics | Undirected Networks

Collaborating Authors

Undirected Networks

News Overviews Instructional Materials AI-Alerts Classics

Learning to reflect: A unifying approach for data-driven stochastic control strategies

Christensen, Sören, Strauch, Claudia, Trottner, Lukas

arXiv.org Machine LearningApr-23-2021

Stochastic optimal control problems have a long tradition in applied probability, with the questions addressed being of high relevance in a multitude of fields. Even though theoretical solutions are well understood in many scenarios, their practicability suffers from the assumption of known dynamics of the underlying stochastic process, raising the statistical challenge of developing purely data-driven strategies. For the mathematically separated classes of continuous diffusion processes and L\'evy processes, we show that developing efficient strategies for related singular stochastic control problems can essentially be reduced to finding rate-optimal estimators with respect to the sup-norm risk of objects associated to the invariant distribution of ergodic processes which determine the theoretical solution of the control problem. From a statistical perspective, we exploit the exponential $\beta$-mixing property as the common factor of both scenarios to drive the convergence analysis, indicating that relying on general stability properties of Markov processes is a sufficiently powerful and flexible approach to treat complex applications requiring statistical methods. We show moreover that in the L\'evy case $-$ even though per se jump processes are more difficult to handle both in statistics and control theory $-$ a fully data-driven strategy with regret of significantly better order than in the diffusion case can be constructed.

control problem, estimator, vy process, (16 more...)

arXiv.org Machine Learning

2104.11496

Country:

North America > United States > New York (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Macao (0.04)
(3 more...)

Genre: Research Report (0.63)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.49)

Add feedback

DRL: Deep Reinforcement Learning for Intelligent Robot Control -- Concept, Literature, and Future

Dargazany, Aras

arXiv.org Artificial IntelligenceApr-20-2021

Combination of machine learning (for generating machine intelligence), computer vision (for better environment perception), and robotic systems (for controlled environment interaction) motivates this work toward proposing a vision-based learning framework for intelligent robot control as the ultimate goal (vision-based learning robot). This work specifically introduces deep reinforcement learning as the the learning framework, a General-purpose framework for AI (AGI) meaning application-independent and platform-independent. In terms of robot control, this framework is proposing specifically a high-level control architecture independent of the low-level control, meaning these two required level of control can be developed separately from each other. In this aspect, the high-level control creates the required intelligence for the control of the platform using the recorded low-level controlling data from that same platform generated by a trainer. The recorded low-level controlling data is simply indicating the successful and failed experiences or sequences of experiments conducted by a trainer using the same robotic platform. The sequences of the recorded data are composed of observation data (input sensor), generated reward (feedback value) and action data (output controller). For experimental platform and experiments, vision sensors are used for perception of the environment, different kinematic controllers create the required motion commands based on the platform application, deep learning approaches generate the required intelligence, and finally reinforcement learning techniques incrementally improve the generated intelligence until the mission is accomplished by the robot.

application, architecture, learning, (14 more...)

arXiv.org Artificial Intelligence

2105.13806

Country:

South America > Brazil > Ceará > Fortaleza (0.04)
North America > United States > California (0.04)
Asia > Japan > Honshū > Kansai > Hyogo Prefecture > Kobe (0.04)

Genre:

Research Report (0.64)
Instructional Material (0.46)

Industry:

Leisure & Entertainment > Games (1.00)
Information Technology (1.00)
Health & Medicine (0.68)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.68)

Add feedback

CVLight: Deep Reinforcement Learning for Adaptive Traffic Signal Control with Connected Vehicles

Li, Wangzhi, Cai, Yaxing, Dinesha, Ujwal, Fu, Yongjie, Di, Xuan

arXiv.org Artificial IntelligenceApr-20-2021

This paper develops a reinforcement learning (RL) scheme for adaptive traffic signal control (ATSC), called "CVLight", that leverages data collected only from connected vehicles (CV). Seven types of RL models are proposed within this scheme that contain various state and reward representations, including incorporation of CV delay and green light duration into state and the usage of CV delay as reward. To further incorporate information of both CV and non-CV into CVLight, an algorithm based on actor-critic, A2C-Full, is proposed where both CV and non-CV information is used to train the critic network, while only CV information is used to update the policy network and execute optimal signal timing. These models are compared at an isolated intersection under various CV market penetration rates. A full model with the best performance (i.e., minimum average travel delay per vehicle) is then selected and applied to compare with state-of-the-art benchmarks under different levels of traffic demands, turning proportions, and dynamic traffic demands, respectively. Two case studies are performed on an isolated intersection and a corridor with three consecutive intersections located in Manhattan, New York, to further demonstrate the effectiveness of the proposed algorithm under real-world scenarios. Compared to other baseline models that use all vehicle information, the trained CVLight agent can efficiently control multiple intersections solely based on CV data and can achieve a similar or even greater performance when the CV penetration rate is no less than 20%.

intersection, penetration rate, vehicle, (16 more...)

arXiv.org Artificial Intelligence

2104.1034

Country:

North America > United States > New York > New York County > Manhattan (0.24)
Europe > Netherlands > North Holland > Amsterdam (0.05)
North America > United States > Tennessee > Anderson County > Oak Ridge (0.04)
(2 more...)

Genre: Research Report > New Finding (0.46)

Industry:

Transportation > Infrastructure & Services (1.00)
Transportation > Ground > Road (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Add feedback

Network-wide traffic signal control optimization using a multi-agent deep reinforcement learning

Li, Zhenning, Yu, Hao, Zhang, Guohui, Dong, Shangjia, Xu, Cheng-Zhong

arXiv.org Artificial IntelligenceApr-20-2021

Inefficient traffic control may cause numerous problems such as traffic congestion and energy waste. This paper proposes a novel multi-agent reinforcement learning method, named KS-DDPG (Knowledge Sharing Deep Deterministic Policy Gradient) to achieve optimal control by enhancing the cooperation between traffic signals. By introducing the knowledge-sharing enabled communication protocol, each agent can access to the collective representation of the traffic environment collected by all agents. The proposed method is evaluated through two experiments respectively using synthetic and real-world datasets. The comparison with state-of-the-art reinforcement learning-based and conventional transportation methods demonstrate the proposed KS-DDPG has significant efficiency in controlling large-scale transportation networks and coping with fluctuations in traffic flow. In addition, the introduced communication mechanism has also been proven to speed up the convergence of the model without significantly increasing the computational burden.

agent, intersection, knowledge, (15 more...)

arXiv.org Artificial Intelligence

doi: 10.1016/j.trc.2021.103059

2104.09936

Country:

Asia > Macao (0.14)
North America > United States > Maryland > Montgomery County (0.04)
North America > United States > Hawaii (0.04)
Asia > China > Hong Kong (0.04)

Genre: Research Report > New Finding (0.93)

Industry:

Transportation > Infrastructure & Services (1.00)
Transportation > Ground > Road (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.46)

Add feedback

Markov models and Markov chains explained in real life: probabilistic workout routine

#artificialintelligenceApr-19-2021, 15:21:05 GMT

Andrei Markov didn't agree with Pavel Nebrasov, when he said independence between variables was necessary for the Weak Law of Large Numbers to be applied. When you collect independent samples, as the number of samples gets bigger, the mean of those samples converges to the true mean of the population. But Markov believed independence was not a necessary condition for the mean to converge. So he set out to define how the average of the outcomes from a process involving dependent random variables could converge over time. Thanks to this intellectual disagreement, Markov created a way to describe how random, also called stochastic, systems or processes evolve over time.

converge, markov chain, markov model, (9 more...)

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Add feedback

A Simulated Experiment to Explore Robotic Dialogue Strategies for People with Dementia

Yuan, Fengpei, Sadovnik, Amir, Zhang, Ran, Casenhiser, Devin, Paek, Eun Jin, Yoon, Si On, Zhao, Xiaopeng

arXiv.org Artificial IntelligenceApr-18-2021

People with Alzheimer's disease and related dementias (ADRD) often show the problem of repetitive questioning, which brings a great burden on persons with ADRD (PwDs) and their caregivers. Conversational robots hold promise of coping with this problem and hence alleviating the burdens on caregivers. In this paper, we proposed a partially observable markov decision process (POMDP) model for the PwD-robot interaction in the context of repetitive questioning, and used Q-learning to learn an adaptive conversation strategy (i.e., rate of follow-up question and difficulty of follow-up question) towards PwDs with different cognitive capabilities and different engagement levels. The results indicated that Q-learning was helpful for action selection for the robot. This may be a useful step towards the application of conversational social robots to cope with repetitive questioning in PwDs.

engagement level, follow-up question, robot, (16 more...)

arXiv.org Artificial Intelligence

2104.0894

Country:

North America > United States > Tennessee > Knox County > Knoxville (0.14)
North America > United States > Iowa > Johnson County > Iowa City (0.14)
North America > United States > Ohio > Butler County > Oxford (0.04)
Asia > Middle East > Iran (0.04)

Genre: Research Report (1.00)

Industry: Health & Medicine > Therapeutic Area > Neurology > Dementia (1.00)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Add feedback

A Self-Supervised Auxiliary Loss for Deep RL in Partially Observable Settings

Ahmed, Eltayeb, Zintgraf, Luisa, de Witt, Christian A. Schroeder, Usunier, Nicolas

arXiv.org Artificial IntelligenceApr-17-2021

In this work we explore an auxiliary loss useful for reinforcement learning in environments where strong performing agents are required to be able to navigate a spatial environment. The auxiliary loss proposed is to minimize the classification error of a neural network classifier that predicts whether or not a pair of states sampled from the agents current episode trajectory are in order. The classifier takes as input a pair of states as well as the agent's memory. The motivation for this auxiliary loss is that there is a strong correlation with which of a pair of states is more recent in the agents episode trajectory and which of the two states is spatially closer to the agent. Our hypothesis is that learning features to answer this question encourages the agent to learn and internalize in memory representations of states that facilitate spatial reasoning. We tested this auxiliary loss on a navigation task in a gridworld and achieved 9.6% increase in accumulative episode reward compared to a strong baseline approach.

agent, arxiv preprint arxiv, auxiliary loss, (13 more...)

arXiv.org Artificial Intelligence

2104.08492

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.05)
Africa > Rwanda > Kigali > Kigali (0.05)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.48)

Add feedback

Financial Engineering and Artificial Intelligence in Python

#artificialintelligenceApr-16-2021, 05:22:46 GMT

Have you ever thought about what would happen if you combined the power of machine learning and artificial intelligence with financial engineering? Today, you can stop imagining, and start doing. This course will teach you the core fundamentals of financial engineering, with a machine learning twist. We will learn about the greatest flub made in the past decade by marketers posing as "machine learning experts" who promise to teach unsuspecting students how to "predict stock prices with LSTMs". You will learn exactly why their methodology is fundamentally flawed and why their results are complete nonsense.

artificial intelligence, financial engineering and artificial intelligence, machine learning, (5 more...)

#artificialintelligence

Genre: Instructional Material > Course Syllabus & Notes (1.00)

Industry: Banking & Finance > Trading (0.59)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.61)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.35)

Add feedback

Extractive Summarization of Call Transcripts

Biswas, Pratik K., Iakubovich, Aleksandr

arXiv.org Artificial IntelligenceApr-15-2021

Text summarization is the process of extracting the most important information from the text and presenting it concisely in fewer sentences. Call transcript is a text that involves textual description of a phone conversation between a customer (caller) and agent(s) (customer representatives). This paper presents an indigenously developed method that combines topic modeling and sentence selection with punctuation restoration in condensing ill-punctuated or un-punctuated call transcripts to produce summaries that are more readable. Extensive testing, evaluation and comparisons have demonstrated the efficacy of this summarizer for call transcript summarization.

machine learning, natural language, transcript, (18 more...)

arXiv.org Artificial Intelligence

doi: 10.1109/ACCESS.2022.3221404

2103.10599

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > Virginia > Arlington County > Arlington (0.04)
North America > United States > Texas (0.04)
(4 more...)

Genre:

Research Report (0.50)
Overview (0.46)

Industry: Telecommunications (0.47)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
(2 more...)

Add feedback

Self-Supervised Exploration via Latent Bayesian Surprise

Mazzaglia, Pietro, Catal, Ozan, Verbelen, Tim, Dhoedt, Bart

arXiv.org Artificial IntelligenceApr-15-2021

Training with Reinforcement Learning requires a reward function that is used to guide the agent towards achieving its objective. However, designing smooth and well-behaved rewards is in general not trivial and requires significant human engineering efforts. Generating rewards in self-supervised way, by inspiring the agent with an intrinsic desire to learn and explore the environment, might induce more general behaviours. In this work, we propose a curiosity-based bonus as intrinsic reward for Reinforcement Learning, computed as the Bayesian surprise with respect to a latent state variable, learnt by reconstructing fixed random features. We extensively evaluate our model by measuring the agent's performance in terms of environment exploration, for continuous tasks, and looking at the game scores achieved, for video games. Our model is computationally cheap and empirically shows state-of-the-art performance on several problems. Furthermore, experimenting on an environment with stochastic actions, our approach emerged to be the most resilient to simple stochasticity. Further visualization is available on the project webpage.(https://lbsexploration.github.io/)

exploration, international conference, learning, (15 more...)

arXiv.org Artificial Intelligence

2104.07495

Country:

North America > Canada > Alberta > Census Division No. 15 > Improvement District No. 9 > Banff (0.04)
Europe > Belgium > Flanders > East Flanders > Ghent (0.04)
Africa > Ethiopia > Addis Ababa > Addis Ababa (0.04)

Genre: Research Report (0.64)

Industry: Leisure & Entertainment > Games > Computer Games (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Robots (0.94)
(2 more...)

Add feedback