esper
You Can't Count on Luck: Why Decision Transformers and RvS Fail in Stochastic Environments
Recently, methods such as Decision Transformer that reduce reinforcement learning to a prediction task and solve it via supervised learning (RvS) have become popular due to their simplicity, robustness to hyperparameters, and strong overall performance on offline RL tasks. However, simply conditioning a probabilistic model on a desired return and taking the predicted action can fail dramatically in stochastic environments since trajectories that result in a return may have only achieved that return due to luck. In this work, we describe the limitations of RvS approaches in stochastic environments and propose a solution. Rather than simply conditioning on returns, as is standard practice, our proposed method, ESPER, conditions on learned average returns which are independent from environment stochasticity. Doing so allows ESPER to achieve strong alignment between target return and expected performance in real environments. We demonstrate this in several challenging stochastic offline-RL tasks including the challenging puzzle game 2048, and Connect Four playing against a stochastic opponent. In all tested domains, ESPER achieves significantly better alignment between the target return and achieved return than simply conditioning on returns. ESPER also achieves higher maximum performance than even the value-based baselines.
- North America > Canada > Ontario > Toronto (0.14)
- North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
- Europe > Austria (0.04)
- (3 more...)
You Can't Count on Luck: Why Decision Transformers and RvS Fail in Stochastic Environments
Paster, Keiran, McIlraith, Sheila, Ba, Jimmy
Recently, methods such as Decision Transformer that reduce reinforcement learning to a prediction task and solve it via supervised learning (RvS) have become popular due to their simplicity, robustness to hyperparameters, and strong overall performance on offline RL tasks. However, simply conditioning a probabilistic model on a desired return and taking the predicted action can fail dramatically in stochastic environments since trajectories that result in a return may have only achieved that return due to luck. In this work, we describe the limitations of RvS approaches in stochastic environments and propose a solution. Rather than simply conditioning on the return of a single trajectory as is standard practice, our proposed method, ESPER, learns to cluster trajectories and conditions on average cluster returns, which are independent from environment stochasticity. Doing so allows ESPER to achieve strong alignment between target return and expected performance in real environments. We demonstrate this in several challenging stochastic offline-RL tasks including the challenging puzzle game 2048, and Connect Four playing against a stochastic opponent. In all tested domains, ESPER achieves significantly better alignment between the target return and achieved return than simply conditioning on returns. ESPER also achieves higher maximum performance than even the value-based baselines.
- North America > Canada > Ontario > Toronto (0.14)
- North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
- Europe > Austria (0.04)
- (3 more...)
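The failure mode described in the abstract above can be illustrated with a toy one-step example. This is a minimal sketch, not the authors' implementation: the dataset, the function names, and the choice to treat each action as its own cluster are all assumptions made for illustration, whereas ESPER learns its trajectory clusters.

```python
from collections import Counter, defaultdict
from statistics import mean

# Hypothetical offline dataset of (action, observed_return) pairs.
# "safe" always returns 1; "gamble" returns 5 half the time and -15 otherwise.
DATASET = [("safe", 1.0)] * 10 + [("gamble", 5.0)] * 5 + [("gamble", -15.0)] * 5


def return_conditioned_action(dataset, target):
    """Naive return conditioning: most common action among the
    trajectories whose observed return is closest to the target."""
    best_gap = min(abs(ret - target) for _, ret in dataset)
    matches = [act for act, ret in dataset if abs(ret - target) == best_gap]
    return Counter(matches).most_common(1)[0][0]


def cluster_average_returns(dataset):
    """Average return per cluster. Assumption: one cluster per action."""
    by_action = defaultdict(list)
    for act, ret in dataset:
        by_action[act].append(ret)
    return {act: mean(rets) for act, rets in by_action.items()}


def average_conditioned_action(dataset, target):
    """Condition on cluster-average returns instead of single-trajectory
    returns: pick the action whose average return is closest to the target."""
    averages = cluster_average_returns(dataset)
    return min(averages, key=lambda act: abs(averages[act] - target))


if __name__ == "__main__":
    target = 5.0
    print("return-conditioned:", return_conditioned_action(DATASET, target))
    print("average-conditioned:", average_conditioned_action(DATASET, target))
    print("cluster averages:", cluster_average_returns(DATASET))
```

With a target return of 5, the naive policy copies the lucky "gamble" trajectories even though gambling averages -5 in this dataset, while conditioning on averages selects "safe", whose expected return of 1 is the closest achievable value.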
NPS' Data Science, AI Certificate Programs Support DOD Workforce Development
On Sept. 9, during the DOD's semi-annual Artificial Intelligence Symposium and Exposition, Secretary of Defense Mark Esper affirmed that the Joint Artificial Intelligence Center (JAIC) in partnership with the Naval Postgraduate School (NPS) and Defense Acquisition University will collaboratively develop an intensive six-week pilot course delivered to more than 80 defense acquisition professionals of all ranks and grades. "These trainees will learn how to apply AI and data science skills to our operations," Esper said in his remarks. "With the support of Congress, the Department plans to request additional funding for the services to grow this effort over time and deliver an AI-ready workforce to the American people." Just as the university's highly-regarded Harnessing Artificial Intelligence video course paved the way for its support of the pilot course, NPS is well positioned to support Esper's declaration for further workforce development through its existing Data Science Certificate, and an upcoming similar certificate program in Artificial Intelligence. In the ongoing effort to expand the Navy's knowledge and expertise in the fields of data science and artificial intelligence, NPS faculty have developed courses that enable students to quickly gain insights in these critical disciplines.
- Government > Regional Government > North America Government > United States Government (1.00)
- Government > Military (1.00)
- Information Technology > Data Science (1.00)
- Information Technology > Artificial Intelligence (1.00)
Eyeing China, Pentagon plans larger and 'more lethal' navy
Washington – U.S. Secretary of Defense Mark Esper announced Wednesday an ambitious plan to expand the U.S. Navy with a range of unmanned and autonomous ships, submarines and aircraft to confront the growing maritime challenge from China. The Pentagon chief said a sweeping review of U.S. naval power dubbed "Future Forward" had laid out a "game-changer" plan that would expand the U.S. sea fleet to more than 355 ships, from the current 293. The plan, which requires adding tens of billions of dollars to the U.S. Navy's budget between now and 2045, is aimed at maintaining superiority over Chinese naval forces, seen as the primary threat to the United States. "The future fleet will be more balanced in its ability to deliver lethal effects from the air, from the sea, and from under the sea," Esper said in a speech at the Rand Corp. in California. The expansion will add "more and smaller" surface ships; more submarines; surface and subsurface vessels that are optionally manned, unmanned and autonomous; and a broad range of unmanned carrier-based aircraft.
- North America > United States > California (0.26)
- Asia > China > Beijing > Beijing (0.06)
- Government > Regional Government > North America Government > United States Government (1.00)
- Government > Military (1.00)
Pentagon to pit AI against human pilots in live fighter trials
U.S. Defense Secretary Mark Esper announced Wednesday that the Pentagon intends to conduct live trials pitting tactical aircraft controlled by artificial intelligence against human pilots in 2024. The announcement comes three weeks after an AI algorithm defeated a human pilot in a simulated dogfight between F-16s, something Esper described as an example of the "tectonic impact of machine learning" for the Defense Department's future. "The AI agent's resounding victory demonstrated the ability of advanced algorithms to outperform humans in virtual dogfights. These simulations will culminate in a real-world competition involving full-scale tactical aircraft in 2024," Esper said in prepared remarks delivered to the department's Artificial Intelligence Symposium. The Aug. 20 test was the finale of the Pentagon research agency's AI air combat competition.
- Government > Regional Government > North America Government > United States Government (1.00)
- Government > Military (1.00)
AI to take on human pilots in real-world fighter aircraft trials
AI will face off against human pilots in real-world fighter aircraft by 2024, Secretary of Defense Mark Esper revealed on Wednesday. The Pentagon announced the plan a month after an AI system demolished an Air Force pilot in a virtual dogfight. An algorithm developed by defense contractor Heron Systems swept a best-of-five aerial duel versus an F-16 pilot wearing a VR helmet. The new trials will test how the AI's capabilities transfer to the real world, Esper explained on Wednesday at the Pentagon's first AI Symposium: "The AI agent's resounding victory demonstrated the ability of advanced algorithms to out-perform humans in virtual dogfights. To be clear, AI's role in our lethality is to support human decision-makers, not replace them."
- North America > United States (0.38)
- Europe > Russia > Central Federal District > Moscow Oblast > Moscow (0.06)
- Asia > Russia (0.06)
- Asia > China (0.06)
- Government > Military > Air Force (1.00)
- Government > Regional Government > North America Government > United States Government (0.38)
AI jet will go head-to-head in a real aerial duel with Top Gun pilot – IAM Network
An artificial intelligence will pilot a fighter jet in a real-life aerial duel with a Top Gun aviator after an AI won a simulated dogfight last month, Defense Secretary reveals. The Pentagon is set to pit an AI-controlled jet against a pilot in a real-life aerial battle in 2024, Defense Secretary Mark Esper announced. News of the slated battle comes just weeks after an artificial intelligence program defeated an F-16 fighter pilot in a virtual dogfight. Esper made the announcement during a speech at the Defense Department's AI Symposium 2020 on Wednesday. Esper hailed the 'tectonic impact of machine learning on the future of warfighting' after the AI algorithm annihilated the human pilot on August 20. While Esper provided no additional details on the face-off, he assured troops that AI will be integrated to help enhance US warfighting, not replace pilots. By Luke Kenton for Dailymail.com. Published: 17:15 EDT, 10 September 2020. Updated: 17:23 EDT, 10 September 2020. Just weeks after an artificial intelligence program defeated an F-16 fighter pilot in a virtual dogfight, the Pentagon is set to raise the stakes by pitting an AI-controlled jet against a pilot in a real-life aerial battle, Defense Secretary Mark Esper announced. Esper unveiled the plans for the battle, slated for 2024, …