AITopics

Genre: Instructional Material > Course Syllabus & Notes (1.00)

Industry:

Banking & Finance > Trading (0.99)
Education > Educational Setting > Online (0.32)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.85)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.58)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.58)

#artificialintelligence Nov-15-2020, 02:40:22 GMT

Important AI and Machine Learning Trends for 2020

Businesses ranging from high-tech startups to multinationals see artificial intelligence as a crucial competitive edge in an increasingly technical and competitive sector. However, the AI field moves so fast that it is often difficult to keep up with the most recent research discoveries and accomplishments, and even more difficult to turn technical results into business results. To help you create a strong AI plan for your company in 2020, I have outlined the hottest trends across various research areas, such as natural language processing, conversational AI, computer vision, and reinforcement learning. I have also included outside educational resources you can follow to deepen your expertise. In 2018, pre-trained language models pushed the limits of natural language understanding and generation.

ai and machine learning trend, learning, pre-trained language version, (12 more...)

Genre:

Research Report > New Finding (0.51)
Research Report > Promising Solution (0.35)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.50)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.39)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.35)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.32)

Karunakaran, Dhanoop, Worrall, Stewart, Nebot, Eduardo

Efficient falsification approach for autonomous vehicle validation using a parameter optimisation technique based on reinforcement learning

arXiv.org Artificial Intelligence Nov-15-2020

The wide-scale deployment of Autonomous Vehicles (AV) appears to be imminent despite many safety challenges that are yet to be resolved. It is well known that there are no universally agreed Verification and Validation (VV) methodologies that guarantee absolute safety, which is crucial for the acceptance of this technology. The uncertainties in the behaviour of the traffic participants and the dynamic world cause stochastic reactions in advanced autonomous systems. The addition of ML algorithms and probabilistic techniques adds significant complexity to the process of real-world testing when compared to traditional methods. Most research in this area focuses on generating challenging concrete scenarios or test cases to evaluate the system performance by looking at the frequency distribution of extracted parameters as collected from real-world data. These approaches generally employ Monte-Carlo simulation and importance sampling to generate critical cases. This paper presents an efficient falsification method to evaluate the System Under Test. The approach is based on a parameter optimisation problem to search for challenging scenarios. The optimisation process aims at finding the challenging case that has maximum return. The method applies a policy-gradient reinforcement learning algorithm to enable the learning. The riskiness of the scenario is measured by the well-established RSS safety metric, Euclidean distance, and instance of a collision. We demonstrate that by using the proposed method, we can more efficiently search for challenging scenarios which could cause the system to fail to satisfy the safety requirements.
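As a rough illustration of searching for a high-return scenario with a policy gradient, the sketch below runs REINFORCE over a single scenario parameter. It is not the paper's method: the `riskiness` function, the parameter name `gap_m`, and all hyperparameters are hypothetical stand-ins for the RSS-based return described above.

```python
import random

def riskiness(gap_m):
    """Hypothetical return: peaks when the scenario parameter is near 3.0."""
    return -(gap_m - 3.0) ** 2

def search_scenario(steps=500, batch=32, lr=0.01, sigma=0.5, seed=0):
    """REINFORCE over the mean of a Gaussian policy on one scenario parameter."""
    rng = random.Random(seed)
    mu = 0.0  # policy mean; sigma stays fixed in this sketch
    for _ in range(steps):
        samples = [rng.gauss(mu, sigma) for _ in range(batch)]
        returns = [riskiness(x) for x in samples]
        baseline = sum(returns) / batch  # batch-mean baseline to reduce variance
        # Gradient of log N(x; mu, sigma) with respect to mu is (x - mu) / sigma^2.
        grad = sum((r - baseline) * (x - mu) / sigma ** 2
                   for x, r in zip(samples, returns)) / batch
        mu += lr * grad  # gradient ascent on the expected return
    return mu
```

Calling `search_scenario()` moves the policy mean toward the region where the toy return is highest; a real falsification setup would replace `riskiness` with a simulator rollout.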

controller, scenario, vehicle, (16 more...)

2011.07699

Country:

Oceania > Australia (0.04)
Europe > Netherlands (0.04)

Genre: Research Report (0.50)

Industry:

Automobiles & Trucks (1.00)
Transportation > Ground > Road (0.94)

Technology:

Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Pasunuru, Ramakanth, Guo, Han, Bansal, Mohit

DORB: Dynamically Optimizing Multiple Rewards with Bandits

arXiv.org Artificial Intelligence Nov-15-2020

Policy gradients-based reinforcement learning has proven to be a promising approach for directly optimizing non-differentiable evaluation metrics for language generation tasks. However, optimizing for a specific metric reward leads to improvements in mostly that metric only, suggesting that the model is gaming the formulation of that metric in a particular way without often achieving real qualitative improvements. Hence, it is more beneficial to make the model optimize multiple diverse metric rewards jointly. While appealing, this is challenging because one needs to manually decide the importance and scaling weights of these metric rewards. Further, it is important to consider using a dynamic combination and curriculum of metric rewards that flexibly changes over time. Considering the above aspects, in our work, we automate the optimization of multiple metric rewards simultaneously via a multi-armed bandit approach (DORB), where at each round, the bandit chooses which metric reward to optimize next, based on expected arm gains. We use the Exp3 algorithm for bandits and formulate two approaches for bandit rewards: (1) Single Multi-reward Bandit (SM-Bandit); (2) Hierarchical Multi-reward Bandit (HM-Bandit). We empirically show the effectiveness of our approaches via various automatic metrics and human evaluation on two important NLG tasks: question generation and data-to-text generation, including on an unseen-test transfer setup. Finally, we present interpretable analyses of the learned bandit curriculum over the optimized rewards.
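The bandit loop described above can be sketched with a textbook Exp3 implementation. This is a minimal sketch, not the authors' code: the class name, the `gamma` value, and the idea that each arm index stands for one metric reward are assumptions for illustration, and rewards are assumed to be scaled to [0, 1].

```python
import math
import random

class Exp3:
    """Textbook Exp3 bandit; each arm stands for one metric reward (assumption)."""

    def __init__(self, n_arms, gamma=0.1, seed=0):
        self.n_arms = n_arms
        self.gamma = gamma  # exploration rate in (0, 1]
        self.weights = [1.0] * n_arms
        self.rng = random.Random(seed)

    def probabilities(self):
        # Mix the weight-proportional distribution with uniform exploration.
        total = sum(self.weights)
        return [(1 - self.gamma) * w / total + self.gamma / self.n_arms
                for w in self.weights]

    def choose(self):
        probs = self.probabilities()
        return self.rng.choices(range(self.n_arms), weights=probs)[0]

    def update(self, arm, reward):
        # Importance-weighted reward estimate for the arm that was played.
        estimate = reward / self.probabilities()[arm]
        self.weights[arm] *= math.exp(self.gamma * estimate / self.n_arms)
```

In a training loop, one would call `choose()` to pick which metric to optimize for the next round, then feed the observed gain back through `update()`.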

bandit, bansal, proceedings, (13 more...)

2011.07635

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
North America > United States > New York (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(2 more...)

Genre: Research Report > Promising Solution (0.66)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)
Information Technology > Data Science > Data Mining > Big Data (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.67)

Vashisht, Dhruv, Rampal, Harshit, Liao, Haiguang, Lu, Yang, Shanbhag, Devika, Fallon, Elias, Kara, Levent Burak

Placement in Integrated Circuits using Cyclic Reinforcement Learning and Simulated Annealing

arXiv.org Artificial Intelligence Nov-15-2020

Physical design and production of Integrated Circuits (IC) is becoming increasingly challenging as the sophistication of IC technology steadily increases. Placement has been one of the most critical steps in IC physical design. Through decades of research, partition-based, analytical-based and annealing-based placers have been enriching the placement solution toolbox. However, open challenges including long run time and lack of ability to generalize continue to restrict wider applications of existing placement tools. We devise a learning-based placement tool based on cyclic application of Reinforcement Learning (RL) and Simulated Annealing (SA) by leveraging the advancement of RL. Results show that the RL module is able to provide a better initialization for SA and thus leads to a better final placement design. Compared to other recent learning-based placers, our method differs chiefly in its combination of RL and SA. It leverages the RL model's ability to quickly get a good rough solution after training and the heuristic's ability to realize greedy improvements in the solution.
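To make the SA half of that combination concrete, here is a minimal simulated annealing sketch on a toy one-dimensional placement. Everything in it is an assumption for illustration: the slot model, the two-pin `nets`, the cooling schedule, and the starting `order`, which stands in for whatever initialization an RL module would supply.

```python
import math
import random

def wirelength(order, nets):
    """Total distance between connected cells when cells occupy slots 0..n-1."""
    pos = {cell: i for i, cell in enumerate(order)}
    return sum(abs(pos[a] - pos[b]) for a, b in nets)

def anneal(order, nets, steps=2000, t_start=2.0, t_end=0.01, seed=0):
    """Refine an initial placement by swapping cells under a cooling schedule."""
    rng = random.Random(seed)
    current = list(order)
    cost = wirelength(current, nets)
    best, best_cost = list(current), cost
    for step in range(steps):
        temp = t_start * (t_end / t_start) ** (step / steps)  # geometric cooling
        i, j = rng.sample(range(len(current)), 2)
        current[i], current[j] = current[j], current[i]
        delta = wirelength(current, nets) - cost
        # Always accept improvements; accept worse moves with Boltzmann probability.
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            cost += delta
            if cost < best_cost:
                best, best_cost = list(current), cost
        else:
            current[i], current[j] = current[j], current[i]  # undo the swap
    return best, best_cost
```

A better starting `order` gives the annealer less distance to cover, which is the role the abstract assigns to the RL module.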

algorithm, initialization, placement, (14 more...)

2011.07577

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.16)
North America > United States > North Carolina (0.04)
North America > United States > California > Santa Clara County > San Jose (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (0.34)

Industry: Semiconductors & Electronics (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.86)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.63)

#artificialintelligence Nov-14-2020, 22:16:32 GMT

Deep reinforcement learning for RAN optimization and control

Due to the high variability of the traffic in the radio access network (RAN), fixed network configurations are not flexible enough to achieve optimal performance. Our vendors provide several eNodeB settings to optimize RAN performance, such as the media access control scheduler, load balancing, etc. But the detailed mechanisms of the eNodeB configurations are usually very complicated and not disclosed, not to mention the large KPI space that needs to be considered. We aim to build an intelligent controller that requires no strong assumptions or domain knowledge about the RAN and can run 24/7 without supervision. To achieve this goal, we first build a closed-loop control testbed RAN in a lab environment with one eNodeB provided by one of the largest wireless vendors and four smartphones. Next, we build a double Q network agent that is trained with live feedback of the key performance indicators from the RAN.
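The article does not spell out its update rule, so the sketch below assumes the standard Double DQN formulation: the online network picks the next action and the target network evaluates it. The function name and the plain-list Q-values are hypothetical; a real agent would get these values from two neural networks.

```python
def double_q_target(reward, next_q_online, next_q_target, gamma=0.99, done=False):
    """Bootstrapped target using Double DQN action selection (assumed variant)."""
    if done:
        return reward  # no bootstrapping past a terminal state
    # Select the action with the online estimates...
    best_action = max(range(len(next_q_online)), key=next_q_online.__getitem__)
    # ...but score it with the target estimates to limit overestimation.
    return reward + gamma * next_q_target[best_action]
```

Decoupling selection from evaluation is what distinguishes this target from the plain Q-learning target, which takes the maximum of a single estimate.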

deep reinforcement, ran optimization and control, reinforcement, (2 more...)

Industry:

Telecommunications (0.64)
Information Technology (0.64)

Technology:

Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.96)

#artificialintelligence Nov-14-2020, 05:56:32 GMT

Please human, can you teach me how to AI?

In the exploding era of computing (ubiquitous, mobile, quantum or whatever suits you better) there is still a Holy Grail we struggle to reach, even if we seem to get closer with every Moore's-law step we advance: Artificial General Intelligence (AGI). Back in 2010 or so, in my days as a Bioengineering MSc student at university, I had my 10-minute epiphany. I suddenly pictured that, some day, a reinforcement learning implementation general enough, on hardware powerful and beautiful enough, might lead to a so-called strong artificial intelligence or artificial general intelligence. Indeed, for those who do not eat machine learning for breakfast, this may look like something really cool, but moving to a more concrete reality, my realization was much more pragmatic. In "traditional" Artificial Intelligence approaches, you pick a task (one you think is worth tackling) and put in place a supervised learning technique.

agi, intelligence, please human, (1 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.58)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.39)

#artificialintelligence Nov-14-2020, 02:25:39 GMT

Cutting-Edge AI: Deep Reinforcement Learning in Python

Created by Lazy Programmer Inc. This is technically Deep Learning in Python part 11 of my deep learning series, and my 3rd reinforcement learning course. Deep Reinforcement Learning is actually the combination of 2 topics: Reinforcement Learning and Deep Learning (Neural Networks). While both of these have been around for quite some time, it's only been recently that Deep Learning has really taken off, and along with it, Reinforcement Learning. The maturation of deep learning has propelled advances in reinforcement learning, which has been around since the 1980s, although some aspects of it, such as the Bellman equation, have been around for much longer.

deep reinforcement learning, learning, reinforcement learning, (10 more...)

Industry:

Education (0.38)
Leisure & Entertainment > Games (0.32)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

#artificialintelligence Nov-14-2020, 02:21:17 GMT

A deep Q-Learning based Path Planning and Navigation System for Firefighting Environments

Live fire creates a dynamic, rapidly changing environment that presents a worthy challenge for deep learning and artificial intelligence methodologies to assist firefighters with scene comprehension, helping them maintain situational awareness and track and relay the important features needed for key decisions as they tackle these catastrophic events. We propose a deep Q-learning based agent that is immune to stress-induced disorientation and anxiety and thus able to make clear decisions for navigation based on the observed and stored facts in live fire environments. As a proof of concept, we imitate a structural fire in a gaming engine called Unreal Engine, which enables the interaction of the agent with the environment. The agent is trained with a deep Q-learning algorithm based on a set of rewards and penalties as per its actions on the environment. We exploit experience replay to accelerate the learning process and augment the learning of the agent with human-derived experiences.
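Experience replay, as mentioned above, usually means storing past transitions and training on random mini-batches of them. The buffer below is a minimal sketch under that assumption; the class name, the transition tuple layout, and the capacity are hypothetical and not taken from the paper.

```python
import random
from collections import deque

class ReplayBuffer:
    """Fixed-size store of transitions, sampled uniformly at random."""

    def __init__(self, capacity, seed=0):
        self.buffer = deque(maxlen=capacity)  # oldest entries drop out when full
        self.rng = random.Random(seed)

    def add(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Never ask for more transitions than are stored.
        k = min(batch_size, len(self.buffer))
        return self.rng.sample(list(self.buffer), k)

    def __len__(self):
        return len(self.buffer)
```

One plausible way to fold in human-derived experiences is to `add` demonstration transitions to the same buffer before the agent starts exploring, though the abstract does not say how the authors do it.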

agent, firefighting environment, path planning and navigation system, (2 more...)

Country: North America > United States > New Mexico > Los Alamos County > Los Alamos (0.09)

Industry: Law Enforcement & Public Safety > Fire & Emergency Services (0.93)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Bodnar, Cristian, Hausman, Karol, Dulac-Arnold, Gabriel, Jonschkowski, Rico

A Geometric Perspective on Self-Supervised Policy Adaptation

arXiv.org Artificial Intelligence Nov-14-2020

One of the most challenging aspects of real-world reinforcement learning (RL) is the multitude of unpredictable and ever-changing distractions that could divert an agent from what it was tasked to do in its training environment. While an agent could learn from reward signals to ignore them, the complexity of the real world can make rewards hard to acquire, or, at best, extremely sparse. A recent class of self-supervised methods has shown promise that reward-free adaptation under challenging distractions is possible. However, previous work focused on a short one-episode adaptation setting. In this paper, we consider a long-term adaptation setup that is more akin to the specifics of the real world and propose a geometric perspective on self-supervised adaptation. We empirically describe the processes that take place in the embedding space during this adaptation process, reveal some of its undesirable effects on performance and show how they can be eliminated. Moreover, we theoretically study how actor-based and actor-free agents can further generalise to the target environment by manipulating the geometry of the manifolds described by the actor and critic functions.

adaptation, representation, source environment, (14 more...)

2011.07318

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
North America > United States > California > Santa Clara County > Mountain View (0.04)
North America > United States > Virginia > Arlington County > Arlington (0.04)
(3 more...)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.68)