handicap
ScriptWorld: Text Based Environment For Learning Procedural Knowledge
Joshi, Abhinav, Ahmad, Areeb, Pandey, Umang, Modi, Ashutosh
Text-based games provide a framework for developing natural language understanding and commonsense knowledge about the world in reinforcement learning based agents. Existing text-based environments often rely on fictional situations and characters to create a gaming framework and are far from real-world scenarios. In this paper, we introduce ScriptWorld: a text-based environment for teaching agents about real-world daily chores and hence imparting commonsense knowledge. To the best of our knowledge, it is the first interactive text-based gaming framework that consists of daily real-world human activities designed using a scripts dataset. We provide gaming environments for 10 daily activities and perform a detailed analysis of the proposed environment. We develop RL-based baseline models/agents to play the games in ScriptWorld. To understand the role of language models in such environments, we leverage features obtained from pre-trained language models in the RL agents. Our experiments show that prior knowledge obtained from a pre-trained language model helps to solve real-world text-based gaming environments. We release the environment via GitHub: https://github.com/Exploration-Lab/ScriptWorld
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Natural Language (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)
Stubborn: An Environment for Evaluating Stubbornness between Agents with Aligned Incentives
Rachum, Ram, Nakar, Yonatan, Mirsky, Reuth
Recent research in multi-agent reinforcement learning (MARL) has shown success in learning social behavior and cooperation. Social dilemmas between agents in mixed-sum settings have been studied extensively, but there is little research into social dilemmas in fully cooperative settings, where agents have no prospect of gaining reward at another agent's expense. While fully-aligned interests are conducive to cooperation between agents, they do not guarantee it. We propose a measure of "stubbornness" between agents that aims to capture the human social behavior from which it takes its name: a disagreement that is gradually escalating and potentially disastrous. We would like to promote research into the tendency of agents to be stubborn, the reactions of counterpart agents, and the resulting social dynamics. In this paper we present Stubborn, an environment for evaluating stubbornness between agents with fully-aligned incentives. In our preliminary results, the agents learn to use their partner's stubbornness as a signal for improving the choices that they make in the environment.
- Europe > United Kingdom > England > Greater London > London (0.05)
- Asia > Middle East > Israel > Tel Aviv District > Tel Aviv (0.05)
- Asia > Middle East > Jordan (0.05)
Learning to be safe, in finite time
Castellano, Agustin, Bazerque, Juan, Mallada, Enrique
This paper aims to put forward the concept that learning to take safe actions in unknown environments, even with probability one guarantees, can be achieved without the need for an unbounded number of exploratory trials, provided that one is willing to relax one's optimality requirements mildly. We focus on the canonical multi-armed bandit problem and seek to study the exploration-preservation trade-off intrinsic to safe learning. More precisely, by defining a handicap metric that counts the number of unsafe actions, we provide an algorithm for discarding unsafe machines (or actions), with probability one, that achieves constant handicap. Our algorithm is rooted in the classical sequential probability ratio test, redefined here for continuing tasks. Under standard assumptions on sufficient exploration, our rule provably detects all unsafe machines in an (expected) finite number of rounds. The analysis also unveils a trade-off between the number of rounds needed to secure the environment and the probability of discarding safe machines. Our decision rule can wrap around any other algorithm to optimize a specific auxiliary goal since it provides a safe environment to search for (approximately) optimal policies. Simulations corroborate our theoretical findings and further illustrate the aforementioned trade-offs.
- South America > Uruguay > Montevideo > Montevideo (0.04)
- North America > United States > New Jersey (0.04)
- North America > United States > Maryland > Baltimore (0.04)
- Information Technology > Data Science > Data Mining > Big Data (1.00)
- Information Technology > Artificial Intelligence > Machine Learning (1.00)
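The abstract above says its rule is rooted in the classical sequential probability ratio test (SPRT). As a rough illustration only, the sketch below implements Wald's classical SPRT for a single bandit arm whose pulls each report whether an unsafe event occurred. It is not the paper's modified rule for continuing tasks. The function name `sprt_unsafe`, the hypothesised rates `p_safe` and `p_unsafe`, and the error levels `alpha` and `beta` are all assumptions made for this example.

```python
import math


def sprt_unsafe(samples, p_safe=0.1, p_unsafe=0.3, alpha=0.05, beta=0.05):
    """Classical Wald SPRT on Bernoulli "unsafe event" observations from one arm.

    H0: unsafe events occur at rate p_safe; H1: at rate p_unsafe.
    All parameter values are illustrative assumptions, not taken from the paper.
    Returns (decision, number_of_samples_used).
    """
    upper = math.log((1 - beta) / alpha)  # crossing it accepts H1 (arm is unsafe)
    lower = math.log(beta / (1 - alpha))  # crossing it accepts H0 (arm is safe)
    llr = 0.0  # running log-likelihood ratio of H1 against H0
    n = 0
    for n, unsafe_event in enumerate(samples, start=1):
        if unsafe_event:
            llr += math.log(p_unsafe / p_safe)
        else:
            llr += math.log((1 - p_unsafe) / (1 - p_safe))
        if llr >= upper:
            return "unsafe", n
        if llr <= lower:
            return "safe", n
    return "undecided", n
```

With these assumed defaults, an arm that produces an unsafe event on every pull is flagged after three pulls, while an arm that never does is declared safe only after twelve, which gives a feel for the trade-off between the number of rounds and the error probabilities that the abstract mentions.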
Asian Handicap football betting with Rating-based Hybrid Bayesian Networks
Despite the massive popularity of the Asian Handicap (AH) football betting market, it has not been adequately studied by the relevant literature. This paper combines rating systems with hybrid Bayesian networks and presents the first published model specifically developed for prediction and assessment of the AH betting market. The results are based on 13 English Premier League seasons and are compared to the traditional 1X2 market. Different betting situations have been examined including a) both average and maximum (best available) market odds, b) all possible betting decision thresholds between predicted and published odds, c) optimisations for both return-on-investment and profit, and d) simple stake adjustments to investigate how the variance of returns changes when targeting equivalent profit in both 1X2 and AH markets. While the AH market is found to share the inefficiencies of the traditional 1X2 market, the findings reveal both interesting differences as well as similarities between the two.
- Europe > United Kingdom > England > Tyne and Wear > Sunderland (0.04)
- Europe > United Kingdom > England > Greater London > London (0.04)
- Europe > Greece > Attica > Athens (0.04)
- Asia > Middle East > Iran (0.04)
- Leisure & Entertainment > Sports > Soccer (1.00)
- Banking & Finance (1.00)
SAI: a Sensible Artificial Intelligence that plays with handicap and targets high scores in 9x9 Go (extended version)
Morandin, Francesco, Amato, Gianluca, Fantozzi, Marco, Gini, Rosa, Metta, Carlo, Parton, Maurizio
We develop a new model that can be applied to any perfect information two-player zero-sum game to target a high score, and thus a perfect play. We integrate this model into the Monte Carlo tree search-policy iteration learning pipeline introduced by Google DeepMind with AlphaGo. Training this model on 9x9 Go produces a superhuman Go player, thus proving that it is stable and robust. We show that this model can be used to effectively play with both positional and score handicap. We develop a family of agents that can target high scores against any opponent, and recover from very severe disadvantage against weak opponents. To the best of our knowledge, these are the first effective achievements in this direction.
- Leisure & Entertainment > Sports (1.00)
- Leisure & Entertainment > Games > Go (1.00)
Are Robots Competing for Your Job?
"Ever since a study by the University of Oxford predicted that 47 percent of U.S. jobs are at risk of being replaced by robots and artificial intelligence over the next fifteen to twenty years, I haven't been able to stop thinking about the future of work," Andrés Oppenheimer writes, in "The Robots Are Coming: The Future of Jobs in the Age of Automation" (Vintage). Chapter 4: "They're Coming for Bankers!" Chapter 5: "They're Coming for Lawyers!" They're attacking hospitals: "They're Coming for Doctors!" They're headed to Hollywood: "They're Coming for Entertainers!" I gather they have not yet come for the manufacturers of exclamation points. The old robots were blue-collar workers, burly and clunky, the machines that rusted the Rust Belt. But, according to the economist Richard Baldwin, in "The Globotics Upheaval: Globalization, Robotics, and the Future of Work" (Oxford), the new ones are "white-collar robots," knowledge workers and quinoa-and-oat-milk globalists, the machines that will bankrupt Brooklyn.
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.25)
- Oceania > New Zealand (0.05)
- North America > United States (0.05)
- Asia > Middle East > UAE > Dubai Emirate > Dubai (0.05)
China catching up to US in race towards artificial intelligence
Artificial intelligence from a Chinese tech giant has defeated the country's best player of the board game Go, despite giving the grandmaster an advantage -- matching and perhaps surpassing Google's efforts last year. The artificial intelligence (AI) developed by Chinese company Tencent beat world number-two Go player Ke Jie last week with a two-stone handicap, the official People's Daily newspaper reported. Handicaps are used in Go to even out the difference in skill level between players. Google's AlphaGo AI beat Ke last year just months after defeating fellow grandmaster Lee Se-dol of South Korea -- however AlphaGo has never competed against top-level players using a handicap. AlphaGo has since been placed in retirement, with Google instead focusing its energies on its self-teaching AlphaGo Zero machine, which mastered the complex game in 40 days last year.
- Asia > China (0.65)
- North America > United States (0.31)
- Asia > South Korea (0.27)
- Leisure & Entertainment > Games > Go (1.00)
- Information Technology (1.00)
How an AI caddie could improve your golf game
Stamford, Conn.-based Arccos Golf Wednesday launched a first-of-its-kind artificial intelligence (AI) caddie for golf, saying the platform will help golfers of all skill levels achieve lower scores with the power of data-driven decisions. Dubbed Arccos Caddie, the platform is powered by the Microsoft Azure cloud platform and trained on a data set comprised of more than 61 million shots hit by the users of Arccos' golf-tracking system, elevation data and 386 million geotagged data points on more than 40,000 golf courses. Arccos notes the AI platform also accounts for weather conditions, including forecasted wind speed, wind direction, precipitation and temperature. "Every shot in golf involves a decision-making process, and the caddie's role has historically been to help you make more intelligent choices," said Sal Syed, CEO and co-founder of Arccos. "Today, however, less than three percent of players have access to a caddie.
- Information Technology > Artificial Intelligence (0.96)
- Information Technology > Cloud Computing (0.64)
- Information Technology > Communications > Social Media (0.40)
Google's AI beats a professional Go player, an industry first
Google has achieved something major in artificial intelligence (AI) research. A computer system it has built to play the ancient Chinese board game Go has managed to win a match against a professional Go player: the European champion Fan Hui. The research is documented in a paper in this week's issue of the journal Nature. The Google system, named AlphaGo, swept France's Hui, who is ranked a 2-dan, in a five-game match at the Google DeepMind office in London in October. AlphaGo played against Hui on a full 19-by-19 Go board and received no handicap.
Here's how Apple plans to protect privacy and still compete on AI
A theory has taken hold in tech: Apple's devotion to privacy will handicap it during the next major wave of computing, where artificial intelligence like voice interaction, personal assistants and automation take center stage. This morning Apple gave its response: It won't handicap us, because we can do both. A concept called "differential privacy" -- an en vogue statistical method designed to reap useful intel from big piles of data while protecting personally identifying information therein. Apple has branded itself as antithetical to Google and Facebook, companies that rely on reams of data. But Apple also wants to provide the perks these companies offer -- more smart, personalized services -- that require reams of data.
- Information Technology > Security & Privacy (0.86)
- Health & Medicine > Therapeutic Area > Hematology (0.52)
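The snippet above describes differential privacy as a statistical method for learning from large data sets while protecting individuals. One textbook mechanism in that family is randomized response, sketched below under stated assumptions. Nothing here is claimed to be what Apple deploys, and the function names `randomized_response` and `estimate_true_rate` are hypothetical.

```python
import random


def randomized_response(truth: bool, rng: random.Random) -> bool:
    """Report the true answer with probability 1/2, otherwise a fair coin flip."""
    if rng.random() < 0.5:
        return truth
    return rng.random() < 0.5


def estimate_true_rate(reports) -> float:
    """Invert the noise: P(report is True) = 0.5 * p + 0.25, so p = 2 * observed - 0.5."""
    observed = sum(reports) / len(reports)
    return 2 * observed - 0.5


if __name__ == "__main__":
    rng = random.Random(0)
    # Synthetic population in which 30% of the true answers are True.
    truths = [i < 6000 for i in range(20000)]
    reports = [randomized_response(t, rng) for t in truths]
    print(round(estimate_true_rate(reports), 3))
```

Each individual report is deniable, because any single answer may have come from the coin flip, yet the aggregate rate can still be estimated from many reports.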