AITopics

2209.05186

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Middle East > Jordan (0.04)
Asia > Japan > Honshū > Kantō > Kanagawa Prefecture (0.04)

Genre: Research Report (0.81)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.48)

da Silva, Alysson Ribeiro

Double Q-Learning for Citizen Relocation During Natural Hazards

arXiv.org Artificial IntelligenceSep-12-2022

Abstract--Natural disasters can cause substantial negative socio-economic impacts around the world, due to mortality, relocation, rates, and reconstruction decisions. Robotics has been successfully applied to identify and rescue victims during the occurrence of a natural hazard. However, little effort has been taken to deploy solutions where an autonomous robot can save the life of a citizen by itself relocating it, without the need to wait for a rescue team composed of humans. Reinforcement learning approaches can be used to deploy such a solution, however, one of the most famous algorithms to deploy it, the Q-learning, suffers from biased results generated when performing its learning routines. In this research a solution for citizen relocation based on Partially Observable Markov Decision Processes is adopted, where the capability of the Double Q-learning in relocating citizens during a natural hazard is evaluated under a proposed hazard simulation engine based on a grid world. The performance of the solution was measured as a success rate of a citizen relocation procedure, where the results show that the technique portrays a performance above 100% for easy scenarios and near 50% for hard ones.

artificial intelligence, machine learning, reinforcement learning, (17 more...)

2209.038

Country:

North America > United States (0.06)
South America > Brazil > Minas Gerais > Belo Horizonte (0.04)
Asia > Nepal (0.04)
Asia > Japan > Honshū > Kansai > Osaka Prefecture > Osaka (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.88)

Meng, Lingheng, Gorbet, Rob, Kulić, Dana

Partial Observability during DRL for Robot Control

arXiv.org Artificial IntelligenceSep-11-2022

Deep Reinforcement Learning (DRL) has made tremendous advances in both simulated and real-world robot control tasks in recent years. Nevertheless, applying DRL to novel robot control tasks is still challenging, especially when researchers have to design the action and observation space and the reward function. In this paper, we investigate partial observability as a potential failure source of applying DRL to robot control tasks, which can occur when researchers are not confident whether the observation space fully represents the underlying state. We compare the performance of three common DRL algorithms, TD3, SAC and PPO under various partial observability conditions. We find that TD3 and SAC become easily stuck in local optima and underperform PPO. We propose multi-step versions of the vanilla TD3 and SAC to improve robustness to partial observability based on one-step bootstrapping.

artificial intelligence, machine learning, reinforcement learning, (15 more...)

2209.04999

Country:

North America > Canada > Ontario > Waterloo Region > Waterloo (0.04)
Oceania > Australia > Victoria > Melbourne (0.04)
North America > Canada > Ontario > Toronto (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.37)

arXiv.org Artificial IntelligenceSep-11-2022

Inferring Articulated Rigid Body Dynamics from RGBD Video

Heiden, Eric, Liu, Ziang, Vineet, Vibhav, Coumans, Erwin, Sukhatme, Gaurav S.

Being able to reproduce physical phenomena ranging from light interaction to contact mechanics, simulators are becoming increasingly useful in more and more application domains where real-world interaction or labeled data are difficult to obtain. Despite recent progress, significant human effort is needed to configure simulators to accurately reproduce real-world behavior. We introduce a pipeline that combines inverse rendering with differentiable simulation to create digital twins of real-world articulated mechanisms from depth or RGB videos. Our approach automatically discovers joint types and estimates their kinematic parameters, while the dynamic properties of the overall mechanism are tuned to attain physically accurate simulations. Control policies optimized in our derived simulation transfer successfully back to the original system, as we demonstrate on a simulated system. Further, our approach accurately reconstructs the kinematic tree of an articulated mechanism being manipulated by a robot, and highly nonlinear dynamics of a real-world coupled pendulum mechanism. Website: https://eric-heiden.github.io/video2sim

mechanism, rigid body, simulation, (16 more...)

2203.10488

Country: North America > United States > California (0.14)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Katuri, Abhiram, Salugu, Sindhu, Tharuni, Gelli, Gouri, Challa Sri

Conversion of Acoustic Signal (Speech) Into Text By Digital Filter using Natural Language Processing

arXiv.org Artificial IntelligenceSep-9-2022

One of the most crucial aspects of communication in daily life is speech recognition. Speech recognition that is based on natural language processing is one of the essential elements in the conversion of one system to another. In this paper, we created an interface that transforms speech and other auditory inputs into text using a digital filter. Contrary to the many methods for this conversion, it is also possible for linguistic faults to appear occasionally, gender recognition, speech recognition that is unsuccessful (cannot recognize voice), and gender recognition to fail. Since technical problems are involved, we developed a program that acts as a mediator to prevent initiating software issues in order to eliminate even this little deviation. Its planned MFCC and HMM are in sync with its AI system. As a result, technical errors have been avoided.

artificial intelligence, machine learning, natural language, (15 more...)

doi: 10.35940/ijeat.A3802.1012122

2209.04189

Country:

Asia > India > Telangana > Hyderabad (0.05)
Asia > India > Odisha (0.04)

Genre: Research Report (0.82)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.68)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.53)

On the Convergence of Monte Carlo UCB for Random-Length Episodic MDPs

Dong, Zixuan, Wang, Che, Ross, Keith

In reinforcement learning, Monte Carlo algorithms update the Q function by averaging the episodic returns. In the Monte Carlo UCB (MC-UCB) algorithm, the action taken in each state is the action that maximizes the Q function plus a UCB exploration term, which biases the choice of actions to those that have been chosen less frequently. Although there has been significant work on establishing regret bounds for MC-UCB, most of that work has been focused on finite-horizon versions of the problem, for which each episode terminates after a constant number of steps. For such finite-horizon problems, the optimal policy depends both on the current state and the time within the episode. However, for many natural episodic problems, such as games like Go and Chess and robotic tasks, the episode is of random length and the optimal policy is stationary. For such environments, it is an open question whether the Q-function in MC-UCB will converge to the optimal Q function; we conjecture that, unlike Q-learning, it does not converge for all MDPs. We nevertheless show that for a large class of MDPs, which includes stochastic MDPs such as blackjack and deterministic MDPs such as Go, the Q-function in MC-UCB converges almost surely to the optimal Q function. An immediate corollary of this result is that it also converges almost surely for all finite-horizon MDPs. We also provide numerical experiments, providing further insights into MC-UCB.

artificial intelligence, machine learning, reinforcement learning, (16 more...)

2209.02864

Country:

North America > United States > New York (0.04)
North America > United States > Washington > King County > Seattle (0.04)
North America > United States > Massachusetts > Middlesex County > Belmont (0.04)
(3 more...)

Genre: Research Report (0.50)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Gu, Jiayuan, Chaplot, Devendra Singh, Su, Hao, Malik, Jitendra

Multi-skill Mobile Manipulation for Object Rearrangement

We study a modular approach to tackle long-horizon mobile manipulation tasks for object rearrangement, which decomposes a full task into a sequence of subtasks. To tackle the entire task, prior work chains multiple stationary manipulation skills with a point-goal navigation skill, which are learned individually on subtasks. Although more effective than monolithic end-to-end RL policies, this framework suffers from compounding errors in skill chaining, e.g., navigating to a bad location where a stationary manipulation skill can not reach its target to manipulate. To this end, we propose that the manipulation skills should include mobility to have flexibility in interacting with the target object from multiple locations and at the same time the navigation skill could have multiple end points which lead to successful manipulation. We operationalize these ideas by implementing mobile manipulation skills rather than stationary ones and training a navigation skill trained with region goal instead of point goal. We evaluate our multi-skill mobile manipulation method M3 on 3 challenging long-horizon mobile manipulation tasks in the Home Assistant Benchmark (HAB), and show superior performance as compared to the baselines.

fridge, manipulation skill, subtask, (15 more...)

2209.02778

Country: North America > United States > California > San Diego County > San Diego (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Robots > Robot Planning & Action (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (0.46)

Das, Spandan, Prabhakar, Pavithra

Bayesian Statistical Model Checking for Multi-agent Systems using HyperPCTL*

In this paper, we present a Bayesian method for statistical model checking (SMC) of probabilistic hyperproperties specified in the logic HyperPCTL* on discrete-time Markov chains (DTMCs). While SMC of HyperPCTL* using sequential probability ratio test (SPRT) has been explored before, we develop an alternative SMC algorithm based on Bayesian hypothesis testing. In comparison to PCTL*, verifying HyperPCTL* formulae is complex owing to their simultaneous interpretation on multiple paths of the DTMC. In addition, extending the bottom-up model-checking algorithm of the non-probabilistic setting is not straight forward due to the fact that SMC does not return exact answers to the satisfiability problems of subformulae, instead, it only returns correct answers with high-confidence. We propose a recursive algorithm for SMC of HyperPCTL* based on a modified Bayes' test that factors in the uncertainty in the recursive satisfiability results. We have implemented our algorithm in a Python toolbox, HyProVer, and compared our approach with the SPRT based SMC. Our experimental evaluation demonstrates that our Bayesian SMC algorithm performs better both in terms of the verification time and the number of samples required to deduce satisfiability of a given HyperPCTL* formula.

algorithm, formula, probability, (15 more...)

2209.02672

Country:

North America > United States > Kansas > Riley County > Manhattan (0.14)
Asia > Middle East > Israel > Haifa District > Haifa (0.04)

Genre: Research Report (0.50)

Industry: Information Technology > Security & Privacy (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.66)
(2 more...)

Uma, Alexandra N., Sityaev, Dmitry

Comparing Methods for Extractive Summarization of Call Centre Dialogue

This paper provides results of evaluating some text summarisation techniques for the purpose of producing call summaries for contact centre solutions. We specifically focus on extractive summarisation methods, as they do not require any labelled data and are fairly quick and easy to implement for production use. We experimentally compare several such methods by using them to produce summaries of calls, and evaluating these summaries objectively (using ROUGE-L) and subjectively (by aggregating the judgements of several annotators). We found that TopicSum and Lead-N outperform the other summarisation methods, whilst BERTSum received comparatively lower scores in both subjective and objective evaluations. The results demonstrate that even such simple heuristics-based methods like Lead-N ca n produce meaningful and useful summaries of call centre dialogues.

computer science & information technology, evaluation, summarisation, (14 more...)

2209.02472

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Europe > Slovakia > Bratislava > Bratislava (0.04)
Asia > Middle East > Jordan (0.04)

Genre:

Overview (1.00)
Research Report > New Finding (0.88)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.47)

arXiv.org Artificial IntelligenceSep-5-2022

Understanding the Behavior of Belief Propagation

Knoll, Christian

Probabilistic graphical models are a powerful concept for modeling high-dimensional distributions. Besides modeling distributions, probabilistic graphical models also provide an elegant framework for performing statistical inference; because of the high-dimensional nature, however, one must often use approximate methods for this purpose. Belief propagation performs approximate inference, is efficient, and looks back on a long success-story. Yet, in most cases, belief propagation lacks any performance and convergence guarantees. Many realistic problems are presented by graphical models with loops, however, in which case belief propagation is neither guaranteed to provide accurate estimates nor that it converges at all. This thesis investigates how the model parameters influence the performance of belief propagation. We are particularly interested in their influence on (i) the number of fixed points, (ii) the convergence properties, and (iii) the approximation quality.

artificial intelligence, belief revision, machine learning, (22 more...)

2209.05464

Genre:

Research Report > New Finding (1.00)
Overview (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Belief Revision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.67)