Collaborating Authors

reinforcement learning

Engineers pre-train AI computers to make them even more powerful


The main drawback to reinforcement learning is that it can't be used in some real-life applications. That's because in the process of training themselves, computers initially try just about anything and everything before eventually stumbling on the right path. This initial trial-and-error phase can be problematic for certain applications, such as climate-control systems where abrupt swings in temperature wouldn't be tolerated. The CSEM engineers have developed an approach that overcomes this problem. They showed that computers can first be trained on extremely simplified theoretical models before being set to learn on real-life systems.
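As a rough illustration of the idea described above (learn on a simplified model first, then continue on the real system with far less random exploration), here is a toy tabular Q-learning sketch. It is not the CSEM engineers' method. The thermostat setup, the function names (`simple_model`, `make_real_system`, `q_learning`, `greedy_policy`) and every parameter value are assumptions made purely for illustration.

```python
import random

ACTIONS = (-1, 0, 1)      # cool, hold, heat (hypothetical thermostat actions)
N_STATES, TARGET = 5, 2   # discrete temperature levels and the desired level


def simple_model(state, action):
    """Crude deterministic model: the action shifts the temperature by one level."""
    nxt = min(max(state + action, 0), N_STATES - 1)
    return nxt, -abs(nxt - TARGET)


def make_real_system(rng, drift_prob=0.2):
    """Stand-in for the real plant: same dynamics plus occasional downward drift."""
    def step(state, action):
        nxt, _ = simple_model(state, action)
        if rng.random() < drift_prob:
            nxt = max(nxt - 1, 0)
        return nxt, -abs(nxt - TARGET)
    return step


def q_learning(step, q, episodes, epsilon, rng, alpha=0.5, gamma=0.9, horizon=10):
    """Epsilon-greedy tabular Q-learning that updates the table q in place."""
    for _ in range(episodes):
        state = rng.randrange(N_STATES)
        for _ in range(horizon):
            if rng.random() < epsilon:
                a = rng.randrange(len(ACTIONS))          # explore
            else:
                a = max(range(len(ACTIONS)), key=lambda i: q[state][i])  # exploit
            nxt, reward = step(state, ACTIONS[a])
            q[state][a] += alpha * (reward + gamma * max(q[nxt]) - q[state][a])
            state = nxt
    return q


def greedy_policy(q):
    """Best-known action for each temperature level."""
    return [ACTIONS[max(range(len(ACTIONS)), key=lambda i: row[i])] for row in q]


if __name__ == "__main__":
    rng = random.Random(0)
    q = [[0.0] * len(ACTIONS) for _ in range(N_STATES)]
    # Phase 1: explore freely on the simplified model, where wild swings cost nothing.
    q_learning(simple_model, q, episodes=500, epsilon=1.0, rng=rng)
    # Phase 2: continue on the "real" system with very little random exploration.
    q_learning(make_real_system(rng), q, episodes=50, epsilon=0.05, rng=rng)
    print(greedy_policy(q))
```

The point of the two phases is that the erratic trial-and-error happens only in the first call, on the model; the second call starts from a policy that already steers toward the target.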

Scientists use reinforcement learning to train quantum algorithm


Recent advancements in quantum computing have driven the scientific community's quest to solve a certain class of complex problems for which quantum computers would be better suited than traditional supercomputers. To improve the efficiency with which quantum computers can solve these problems, scientists are investigating the use of artificial intelligence approaches. In a new study, scientists at the U.S. Department of Energy's (DOE) Argonne National Laboratory have developed a new algorithm based on reinforcement learning to find the optimal parameters for the Quantum Approximate Optimization Algorithm (QAOA), which allows a quantum computer to solve certain combinatorial problems such as those that arise in materials design, chemistry and wireless communications. "Combinatorial optimization problems are those for which the solution space gets exponentially larger as you expand the number of decision variables," said Argonne computer scientist Prasanna Balaprakash. "In one traditional example, you can find the shortest route for a salesman who needs to visit a few cities once by enumerating all possible routes, but given a couple thousand cities, the number of possible routes far exceeds the number of stars in the universe; even the fastest supercomputers cannot find the shortest route in a reasonable time."
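To make the scale in the quote concrete, here is a minimal Python sketch of the enumeration approach Balaprakash describes. The function names (`route_count`, `shortest_tour`) and the distance-matrix input are assumptions for illustration and are not part of the Argonne study. The count assumes a fixed starting city and treats a tour and its reverse as the same route.

```python
import itertools
import math


def route_count(n_cities):
    """Distinct closed tours through n cities (fixed start, direction ignored)."""
    if n_cities < 3:
        return 1
    return math.factorial(n_cities - 1) // 2


def shortest_tour(dist):
    """Brute force: try every ordering of cities 1..n-1, starting and ending at city 0."""
    n = len(dist)
    best_len, best_tour = math.inf, None
    for perm in itertools.permutations(range(1, n)):
        tour = (0, *perm, 0)
        length = sum(dist[a][b] for a, b in zip(tour, tour[1:]))
        if length < best_len:
            best_len, best_tour = length, tour
    return best_len, best_tour
```

Under these assumptions, 4 cities give 3 distinct tours and 10 cities give 181,440, so enumeration stops being practical long before the "couple thousand cities" mentioned in the quote.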

The Future of AI Part 1


Venture capital investment in AI-related startups reportedly rose sharply in 2018, jumping by 72% compared to 2017, even though the number of startups funded fell to 466 from 533 in 2017. The PwC MoneyTree report stated that seed-stage deal activity in the US among AI-related companies rose to 28% in the fourth quarter of 2018, compared to 24% in the three months prior, while expansion-stage deal activity jumped to 32% from 23%. International rivalry over the global leadership of AI is set to increase. President Putin of Russia was quoted as saying that "the nation that leads in AI will be the ruler of the world". Billionaire Mark Cuban was reported by CNBC as stating that "the world's first trillionaire would be an AI entrepreneur".

Using deep learning to control the unconsciousness level of patients in an anesthetic state


In recent years, researchers have been developing machine learning algorithms for an increasingly wide range of purposes. This includes algorithms that can be applied in healthcare settings, for instance helping clinicians to diagnose specific diseases or neuropsychiatric disorders or monitor the health of patients over time. Researchers at Massachusetts Institute of Technology (MIT) and Massachusetts General Hospital have recently carried out a study investigating the possibility of using deep reinforcement learning to control the levels of unconsciousness of patients who require anesthesia for a medical procedure. Their paper, set to be published in the proceedings of the 2020 International Conference on Artificial Intelligence in Medicine, was voted the best paper presented at the conference. "Our lab has made significant progress in understanding how anesthetic medications affect neural activity and now has a multidisciplinary team studying how to accurately determine anesthetic doses from neural recordings," Gabriel Schamberg, one of the researchers who carried out the study, told TechXplore.

A robot triumphs in a curling match against elite humans


A robot equipped with artificial intelligence (AI) can excel at the Olympic sport of curling -- and even beat top-level human teams. Success requires precision and strategy, but the game is less complex than other real-world applications of robotics. That makes curling a useful test case for AI technologies, which often perform well in simulations but falter in real-world scenarios with changing conditions. Using a method called adaptive deep reinforcement learning, Seong-Whan Lee and his colleagues at Korea University in Seoul created an algorithm that learns through trial and error to adjust a robot's throws to account for changing conditions, such as the ice surface and the positions of stones. The team's robot, nicknamed Curly, needed a few test throws to calibrate itself to the curling rink where it was to compete.

Watch a Robot AI Beat World-Class Curling Competitors


Artificial intelligence still needs to bridge the "sim-to-real" gap. Deep-learning techniques that are all the rage in AI log superlative performances in mastering cerebral games, including chess and Go, both of which can be played on a computer. But translating simulations to the physical world remains a bigger challenge. A robot named Curly that uses "deep reinforcement learning"--making improvements as it corrects its own errors--came out on top in three of four games against top-ranked human opponents from South Korean teams that included a women's team and a reserve squad for the national wheelchair team. One crucial finding was that the AI system demonstrated its ability to adapt to changing ice conditions.

Exploring exploration: comparing children with RL agents in unified environments


Despite recent advances in artificial intelligence (AI) research, human children are still by far the best learners we know of, learning impressive skills like language and high-level reasoning from very little data. Children's learning is supported by highly efficient, hypothesis-driven exploration: in fact, they explore so well that many machine learning researchers have been inspired to put videos like the one below in their talks to motivate research into exploration methods. However, because applying results from studies in developmental psychology can be difficult, this video is often the extent to which such research actually connects with human cognition. Why is directly applying research from developmental psychology to problems in AI so hard? For one, taking inspiration from developmental studies can be difficult because the environments in which human children and artificial agents are typically studied can be very different. Traditionally, reinforcement learning (RL) research takes place in grid-world-like settings or other 2D games, whereas children act in the real world, which is rich and three-dimensional.

[R] GRAC: Self-Guided and Self-Regularized Actor-Critic


Abstract: Deep reinforcement learning (DRL) algorithms have successfully been demonstrated on a range of challenging decision making and control tasks. One dominant component of recent deep reinforcement learning algorithms is the target network which mitigates the divergence when learning the Q function. However, target networks can slow down the learning process due to delayed function updates. Another dominant component especially in continuous domains is the policy gradient method which models and optimizes the policy directly. However, when Q functions are approximated with neural networks, their landscapes can be complex and therefore mislead the local gradient.
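For readers unfamiliar with the component the abstract refers to, the following is a minimal sketch of how a target network is commonly kept in sync with the online Q function. It illustrates the general idea only and is not GRAC itself. The flat weight lists, the function names and the `tau` parameter are assumptions for illustration.

```python
def hard_update(target, online):
    """Copy the online weights into the target network in one step."""
    target[:] = online


def soft_update(target, online, tau):
    """Polyak averaging: move each target weight a fraction tau toward the online weight."""
    for i, w in enumerate(online):
        target[i] = (1.0 - tau) * target[i] + tau * w


def td_target(reward, next_q_from_target, gamma, done):
    """Bootstrapped target computed from the (lagging) target network's Q-values."""
    if done:
        return reward
    return reward + gamma * max(next_q_from_target)
```

With a small `tau`, the target weights trail the online weights by many updates. That lag is what stabilises the bootstrapped target, and it is also the delayed function update that the abstract says can slow learning.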

UC Berkeley Reward-Free RL Beats SOTA Reward-Based RL


End-to-end Deep Reinforcement Learning (DRL) is a trending training approach in the field of computer vision, where it has proven successful at solving a wide range of complex tasks that were previously regarded as out of reach. End-to-end DRL is now being applied in domains ranging from real-world and simulated robotics to sophisticated video games. However, as appealing as end-to-end DRL methods are, most rely heavily on reward functions in order to learn visual features. This means feature learning suffers when rewards are sparse, which is the case in most real-world scenarios. ATC (Augmented Temporal Contrast) trains a convolutional encoder to associate pairs of observations separated by a short time difference. Random shift, a stochastic data augmentation to the observations, is applied within each training batch.
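The two data-preparation steps mentioned above, pairing observations a short time apart and applying a random shift, can be sketched in plain Python as follows. This is an illustrative sketch rather than the authors' implementation. The function names, the nested-list image format, and the choice to fill the shifted-in border by repeating edge pixels are all assumptions, and the encoder and contrastive loss are omitted.

```python
import random


def random_shift(image, pad, rng):
    """Shift a 2D image by up to `pad` pixels in each direction.

    Equivalent to padding by `pad` pixels (edge pixels repeated, an assumption)
    and cropping a window of the original size at a random offset.
    """
    h, w = len(image), len(image[0])
    dy = rng.randint(0, 2 * pad)
    dx = rng.randint(0, 2 * pad)
    out = []
    for y in range(h):
        src_y = min(max(y + dy - pad, 0), h - 1)
        out.append([image[src_y][min(max(x + dx - pad, 0), w - 1)] for x in range(w)])
    return out


def temporal_pairs(observations, k):
    """Pair each observation with the one k steps later in the same trajectory."""
    return [(observations[t], observations[t + k]) for t in range(len(observations) - k)]


def augmented_pairs(observations, k, pad, rng):
    """Build (anchor, positive) pairs with an independent random shift on each image."""
    return [(random_shift(a, pad, rng), random_shift(b, pad, rng))
            for a, b in temporal_pairs(observations, k)]
```

No reward appears anywhere in this pipeline, which is why the approach can still learn visual features when rewards are sparse.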

[D] Quality Contributions Roundup 9/14


Though the community continues to develop new algorithms, state-of-the-art results have stopped improving in the last couple of years. Since RL algorithms that use a tremendous amount of online data to learn from scratch are infeasible to apply in the real world, much research has moved to fields such as Meta-RL, offline RL, integrating RL with domain knowledge, and integrating RL with planning. "How do you unit test end-to-end ML pipelines?", by u/farmingvillein: As perhaps a bit of tldr: once you've got the bare minimum data-replay testing in place ("yeah, it is probably working, because the results are pretty close to what they were before"), I'd encourage you to consider focusing your energy toward thinking of testing as outlier detection. Outliers, in real-world ML systems, tend to be harbingers of things that are wrong systematically, upstream data problems, and logic (pre-/post-processing) problems. "How do you transition from a no name international college to FAIR/Brain?", by u/r-sync: Coming from a no-name Indian engineering college with meh grades, you do have to get a bit creative, very persistent and build credibility for yourself. The examples above are one way to do so, but you can also maybe articulate your thoughts as really good blog posts and arxiv papers, or show great software engineering skills in open-source (i.e.