AITopics

2010.14274

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Europe > United Kingdom (0.04)
North America > United States > New York > New York County > New York City (0.04)
(5 more...)

Genre: Research Report > New Finding (0.68)

Industry: Education (0.92)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.45)

#artificialintelligenceOct-26-2020, 05:27:25 GMT

The Exponential Growth of AI in Brain Care and Treatment

Advances in computer science are helping to accelerate a broad spectrum of scientific research. The more complex the problem, the greater the potential for artificial intelligence (AI) machine learning to help identify patterns and make predictions. How widely is machine learning being used in treating diseases and disorders of the brain? A new study published earlier this month in the science journal APL Bioengineering examines the state-of-the-art uses of AI for brain disease, and shows there has been exponential growth in over a decade. The biological brain has been the inspiration for artificial neural networks, a type of artificial intelligence (AI) machine learning model.

algorithm, artificial intelligence, machine learning, (15 more...)

#artificialintelligence

Country: Europe > Italy > Calabria (0.05)

Genre: Research Report > New Finding (0.36)

Industry: Health & Medicine > Therapeutic Area > Neurology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.32)

arXiv.org Artificial IntelligenceOct-26-2020

Computing Nash Equilibria in Multiplayer DAG-Structured Stochastic Games with Persistent Imperfect Information

Ganzfried, Sam

Many important real-world settings contain multiple players interacting over an unknown duration with probabilistic state transitions, and are naturally modeled as stochastic games. Prior research on algorithms for stochastic games has focused on two-player zero-sum games, games with perfect information, and games with imperfect-information that is local and does not extend between game states. We present an algorithm for approximating Nash equilibrium in multiplayer general-sum stochastic games with persistent imperfect information that extends throughout game play. We experiment on a 4-player imperfect-information naval strategic planning scenario. Using a new procedure, we are able to demonstrate that our algorithm computes a strategy that closely approximates Nash equilibrium in this game.

algorithm, artificial intelligence, machine learning, (20 more...)

2010.1386

Country:

Pacific Ocean > North Pacific Ocean > South China Sea (0.04)
North America > United States > Texas (0.04)
North America > United States > Michigan > Washtenaw County > Ann Arbor (0.04)
Asia > China (0.04)

Genre: Research Report (0.64)

Industry: Leisure & Entertainment > Games (1.00)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.70)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.68)

Deshpande, Niranjan, Vaufreydaz, Dominique, Spalanzani, Anne

Behavioral decision-making for urban autonomous driving in the presence of pedestrians using Deep Recurrent Q-Network

arXiv.org Machine LearningOct-26-2020

Decision making for autonomous driving in urban environments is challenging due to the complexity of the road structure and the uncertainty in the behavior of diverse road users. Traditional methods consist of manually designed rules as the driving policy, which require expert domain knowledge, are difficult to generalize and might give sub-optimal results as the environment gets complex. Whereas, using reinforcement learning, optimal driving policy could be learned and improved automatically through several interactions with the environment. However, current research in the field of reinforcement learning for autonomous driving is mainly focused on highway setup with little to no emphasis on urban environments. In this work, a deep reinforcement learning based decision-making approach for high-level driving behavior is proposed for urban environments in the presence of pedestrians. For this, the use of Deep Recurrent Q-Network (DRQN) is explored, a method combining state-of-the art Deep Q-Network (DQN) with a long term short term memory (LSTM) layer helping the agent gain a memory of the environment. A 3-D state representation is designed as the input combined with a well defined reward function to train the agent for learning an appropriate behavior policy in a real-world like urban simulator. The proposed method is evaluated for dense urban scenarios and compared with a rule-based approach and results show that the proposed DRQN based driving behavior decision maker outperforms the rule-based approach.

artificial intelligence, machine learning, reinforcement learning, (17 more...)

2010.13407

Country: Europe > France > Auvergne-Rhône-Alpes > Isère > Grenoble (0.06)

Genre: Research Report > New Finding (0.34)

Industry:

Transportation > Ground > Road (1.00)
Automobiles & Trucks (1.00)

Technology:

Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
(2 more...)

Alt, Bastian, Schultheis, Matthias, Koeppl, Heinz

POMDPs in Continuous Time and Discrete Spaces

arXiv.org Machine LearningOct-26-2020

Many processes, such as discrete event systems in engineering or population dynamics in biology, evolve in discrete space and continuous time. We consider the problem of optimal decision making in such discrete state and action space systems under partial observability. This places our work at the intersection of optimal filtering and optimal control. At the current state of research, a mathematical description for simultaneous decision making and filtering in continuous time with finite state and action spaces is still missing. In this paper, we give a mathematical description of a continuous-time partial observable Markov decision process (POMDP). By leveraging optimal filtering theory we derive a Hamilton-Jacobi-Bellman (HJB) type equation that characterizes the optimal solution. Using techniques from deep learning we approximately solve the resulting partial integro-differential equation. We present (i) an approach solving the decision problem offline by learning an approximation of the value function and (ii) an online algorithm which provides a solution in belief space using deep reinforcement learning. We show the applicability on a set of toy examples which pave the way for future methods providing solutions for high dimensional problems.

artificial intelligence, equation, machine learning, (18 more...)

2010.01014

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
Europe > Germany > Hesse > Darmstadt Region > Darmstadt (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(6 more...)

Genre: Research Report (0.70)

Industry: Government > Military (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

arXiv.org Artificial IntelligenceOct-26-2020

MELD: Meta-Reinforcement Learning from Images via Latent State Models

Zhao, Tony Z., Nagabandi, Anusha, Rakelly, Kate, Finn, Chelsea, Levine, Sergey

Meta-reinforcement learning algorithms can enable autonomous agents, such as robots, to quickly acquire new behaviors by leveraging prior experience in a set of related training tasks. However, the onerous data requirements of meta-training compounded with the challenge of learning from sensory inputs such as images have made meta-RL challenging to apply to real robotic systems. Latent state models, which learn compact state representations from a sequence of observations, can accelerate representation learning from visual inputs. In this paper, we leverage the perspective of meta-learning as task inference to show that latent state models can \emph{also} perform meta-learning given an appropriately defined observation space. Building on this insight, we develop meta-RL with latent dynamics (MELD), an algorithm for meta-RL from images that performs inference in a latent state model to quickly acquire new skills given observations and rewards. MELD outperforms prior meta-RL methods on several simulated image-based robotic control problems, and enables a real WidowX robotic arm to insert an Ethernet cable into new locations given a sparse task completion signal after only $8$ hours of real world meta-training. To our knowledge, MELD is the first meta-RL algorithm trained in a real-world robotic control setting from images.

machine learning, meld, reinforcement learning, (15 more...)

2010.13957

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > California > Alameda County > Berkeley (0.04)
Europe > Germany > North Rhine-Westphalia > Upper Bavaria > Munich (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (1.00)

Industry:

Leisure & Entertainment (0.68)
Education (0.66)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.47)

Wu, Chien-Sheng, Hoi, Steven, Xiong, Caiming

Improving Limited Labeled Dialogue State Tracking with Self-Supervision

arXiv.org Artificial IntelligenceOct-26-2020

Existing dialogue state tracking (DST) models require plenty of labeled data. However, collecting high-quality labels is costly, especially when the number of domains increases. In this paper, we address a practical DST problem that is rarely discussed, i.e., learning efficiently with limited labeled data. We present and investigate two self-supervised objectives: preserving latent consistency and modeling conversational behavior. We encourage a DST model to have consistent latent distributions given a perturbed input, making it more robust to an unseen scenario. We also add an auxiliary utterance generation task, modeling a potential correlation between conversational behavior and dialogue states. The experimental results show that our proposed self-supervised signals can improve joint goal accuracy by 8.95\% when only 1\% labeled data is used on the MultiWOZ dataset. We can achieve an additional 1.76\% improvement if some unlabeled data is jointly trained as semi-supervised learning. We analyze and visualize how our proposed self-supervised signals help the DST task and hope to stimulate future data-efficient DST research.

artificial intelligence, machine learning, natural language, (16 more...)

2010.1392

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
Europe > Italy > Tuscany > Florence (0.04)
(3 more...)

Genre: Research Report > New Finding (0.34)

Industry:

Consumer Products & Services (1.00)
Transportation (0.69)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Ghosh, Indrajit, Bhattacharya, Anirban, Pati, Debdeep

Statistical optimality and stability of tangent transform algorithms in logit models

arXiv.org Machine LearningOct-25-2020

A systematic approach to finding variational approximation in an otherwise intractable non-conjugate model is to exploit the general principle of convex duality by minorizing the marginal likelihood that renders the problem tractable. While such approaches are popular in the context of variational inference in non-conjugate Bayesian models, theoretical guarantees on statistical optimality and algorithmic convergence are lacking. Focusing on logistic regression models, we provide mild conditions on the data generating process to derive non-asymptotic upper bounds to the risk incurred by the variational optima. We demonstrate that these assumptions can be completely relaxed if one considers a slight variation of the algorithm by raising the likelihood to a fractional power. Next, we utilize the theory of dynamical systems to provide convergence guarantees for such algorithms in logistic and multinomial logit regression. In particular, we establish local asymptotic stability of the algorithm without any assumptions on the data-generating process. We explore a special case involving a semi-orthogonal design under which a global convergence is obtained. The theory is further illustrated using several numerical studies.

artificial intelligence, convergence, machine learning, (19 more...)

2010.13039

Country:

North America > United States > Texas > Brazos County > College Station (0.14)
Asia > Middle East > Jordan (0.05)
North America > United States > Massachusetts (0.04)
(2 more...)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.65)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.54)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Du, Jianzhun, Futoma, Joseph, Doshi-Velez, Finale

Model-based Reinforcement Learning for Semi-Markov Decision Processes with Neural ODEs

arXiv.org Machine LearningOct-25-2020

We present two elegant solutions for modeling continuous-time dynamics, in a novel model-based reinforcement learning (RL) framework for semi-Markov decision processes (SMDPs), using neural ordinary differential equations (ODEs). Our models accurately characterize continuous-time dynamics and enable us to develop high-performing policies using a small amount of data. We also develop a model-based approach for optimizing time schedules to reduce interaction rates with the environment while maintaining the near-optimal performance, which is not possible for model-free methods. We experimentally demonstrate the efficacy of our methods across various continuous-time domains.

arxiv preprint arxiv, machine learning, reinforcement learning, (15 more...)

2006.1621

Country:

North America > United States > California > Alameda County > Berkeley (0.14)
North America > United States > North Carolina (0.04)
North America > United States > New York (0.04)
(2 more...)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology > HIV (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.70)

Rubies-Royo, Vicenc, Mazumdar, Eric, Dong, Roy, Tomlin, Claire, Sastry, S. Shankar

Expert Selection in High-Dimensional Markov Decision Processes

arXiv.org Artificial IntelligenceOct-25-2020

Abstract-- In this work we present a multi-armed bandit framework for online expert selection in Markov decision processes and demonstrate its use in high-dimensional settings. Our method takes a set of candidate expert policies and switches between them to rapidly identify the best performing expert using a variant of the classical upper confidence bound algorithm, thus ensuring low regret in the overall performance of the system. This is useful in applications where several expert policies may be available, and one needs to be selected at runtime for the underlying environment. Markov decision processes (MDPs) represent a mathematical reach a set of high-reward states. For the high-dimensional framework for dealing with decision problems in case, we will use the Seaquest video game environment, many fields. It is usually hard, however, to predict how shown in Figure 1, where each expert policy is trained under changes in the underlying MDP might affect the performance different observation models.

algorithm, artificial intelligence, machine learning, (17 more...)

2010.15599

Country:

North America > United States > California > Alameda County > Berkeley (0.14)
Asia > Middle East > Jordan (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(2 more...)

Genre: Research Report (0.51)

Industry: Leisure & Entertainment > Games > Computer Games (0.86)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)