South America
Unit Dependency Graph and Its Application to Arithmetic Word Problem Solving
Roy, Subhro (University of Illinois, Urbana Champaign) | Roth, Dan (University of Illinois, Urbana Champaign)
Math word problems provide a natural abstraction to a range of natural language understanding problems that involve reasoning about quantities, such as interpreting election results, news about casualties, and the financial section of a newspaper. Units associated with the quantities often provide information that is essential to support this reasoning. This paper proposes a principled way to capture and reason about units and shows how it can benefit an arithmetic word problem solver. It presents the concept of Unit Dependency Graphs (UDGs), which provide a compact representation of the dependencies between units of numbers mentioned in a given problem. Inducing the UDG alleviates the brittleness of the unit extraction system and allows for a natural way to leverage domain knowledge about unit compatibility for word problem solving. We introduce a decomposed model for inducing UDGs with minimal additional annotations, and use it to augment the expressions used in the arithmetic word problem solver of Roy and Roth (2015) via a constrained inference framework. We show that the introduction of UDGs reduces the error of the solver by over 10%, surpassing all existing systems for solving arithmetic word problems. It also makes the system more robust to adaptation to new vocabulary and equation forms.
Nash Stability in Social Distance Games
Balliu, Alkida (Gran Sasso Science Institute) | Flammini, Michele (University of L'Aquila and Gran Sasso Science Institute) | Melideo, Giovanna (University of L'Aquila) | Olivetti, Dennis (Gran Sasso Science Institute)
Coalition formation is a pervasive aspect of social life and it has been studied extensively in algorithmic game theory using the natural model of Hedonic Games (HGs), introduced in (Dreze and Greenberg 1980) and further explored in (Aziz, Brandt, and Harrenstein 2011; Aziz, Brandt, and Seedig 2013; Banerjee, Konishi, and Sönmez 2001; Bogomolnaia and Jackson 2002; Elkind and Wooldridge 2009; Elkind, Fanelli, and Flammini 2016; Gairing and Savani 2010). A HG consists of a set of selfish agents (humans, robots, software agents, etc.) having preferences over coalitions that might include them, regardless of which other coalitions may or may not be present. The outcome is a partition of the agent set into disjoint coalitions (or clusters), referred to as a clustering or coalition structure. In this paper we focus on Social Distance Games (SDGs), an important subclass of HGs introduced in (Brânzei and Larson 2011) where agent utilities are based on the concept of social distance (i.e., the number of hops required to reach one node from another), which has become famous since Milgram's study on six degrees of separation. In SDGs the utility of an agent is given by the average inverse distance from all the other nodes in her coalition, that is by her harmonic centrality (Boldi and Vigna 2014) divided by the size of the coalition. The basic idea is that the agents prefer to maintain ties with other agents who are close to them. The utility formulation is a variant of the closeness centrality and reflects the principle of homophily, that similarity breeds connection and people tend to form communities with similar ...
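The SDG utility described in this abstract (harmonic centrality within the coalition, divided by the coalition size) can be illustrated with a minimal Python sketch. The function name `sdg_utility`, the adjacency-dict graph format, and the choice to measure distances inside the subgraph induced by the coalition (with unreachable members contributing zero) are assumptions made for illustration; the abstract does not fix these details.

```python
from collections import deque


def sdg_utility(graph, coalition, agent):
    """Harmonic centrality of `agent` inside `coalition`, divided by the coalition size.

    graph: dict mapping each node to an iterable of its neighbours (undirected).
    Assumption: distances are hop counts in the subgraph induced by the coalition,
    and members that cannot be reached contribute 0.
    """
    members = set(coalition)
    if agent not in members or len(members) <= 1:
        return 0.0

    # Breadth-first search restricted to coalition members.
    dist = {agent: 0}
    queue = deque([agent])
    while queue:
        u = queue.popleft()
        for v in graph.get(u, ()):
            if v in members and v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)

    harmonic = sum(1.0 / d for node, d in dist.items() if node != agent)
    return harmonic / len(members)


# Path graph a - b - c, all three agents in one coalition.
path = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
print(sdg_utility(path, ["a", "b", "c"], "b"))  # centre of the path
print(sdg_utility(path, ["a", "b", "c"], "a"))  # endpoint of the path
```

In this toy example the central agent gets a higher utility than the endpoints, which matches the stated idea that agents prefer coalitions whose members are close to them.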
Taming the Matthew Effect in Online Markets with Social Influence
Berbeglia, Franco (Carnegie Mellon University) | Hentenryck, Pascal Van (University of Michigan)
The songs are organized in a list or matrix form, giving different visibilities to the various songs, as is typically the case in online advertisement, online stores, or physical retail stores (e.g., (Craswell et al. 2008; Lim, Rodrigues, and Zhang 2004)). Each song was also associated with a popularity signal (e.g., (Engstrom and Forsell 2014; Viglia, Furlan, and Ladrón-de Guevara 2014)), i.e., the number of downloads of the song by earlier market [...] [...] a monopoly in the long run. This "winner-takes-all" phenomenon, although optimal from an efficiency standpoint, is typically considered undesirable. This paper proposes a novel strategy that aims at addressing the three problems identified by Salganik, Dodds, and Watts (2006) simultaneously: unpredictability, inefficiencies, and inequalities. The strategy is a randomized segmentation protocol and is simple to deploy in online settings.
Improving Deep Reinforcement Learning with Knowledge Transfer
Glatt, Ruben (Universidade de São Paulo) | Costa, Anna Helena Reali (Universidade de São Paulo)
Recent successes in applying Deep Learning techniques to Reinforcement Learning algorithms have led to a wave of breakthrough developments in agent theory and established the field of Deep Reinforcement Learning (DRL). While DRL has shown great results for single-task learning, the multi-task case is still underrepresented in the available literature. This D.Sc. research proposal aims at extending DRL to the multi-task case by leveraging the power of Transfer Learning (TL) algorithms to improve the training time and results for multi-task learning. Our focus lies on defining a novel framework for scalable DRL agents that detects similarities between tasks and balances various TL techniques, such as parameter initialization and policy or skill transfer.
Accelerating Multiagent Reinforcement Learning through Transfer Learning
Silva, Felipe Leno da (Escola Politécnica da Universidade de São Paulo) | Costa, Anna Helena Reali (Escola Politécnica da Universidade de São Paulo)
Reinforcement Learning (RL) is a widely used solution for sequential decision-making problems and has been used in many complex domains. However, RL algorithms suffer from scalability issues, especially when multiple agents are acting in a shared environment. This research intends to accelerate learning in multiagent sequential decision-making tasks by reusing previous knowledge, both from past solutions and advising between agents. We intend to contribute a Transfer Learning framework focused on Multiagent RL, requiring as few domain-specific hand-coded parameters as possible.
Policy Reuse in Deep Reinforcement Learning
Glatt, Ruben (Universidade de São Paulo) | Costa, Anna Helena Reali (Universidade de São Paulo)
Driven by recent developments in Artificial Intelligence research, a promising new technology for building intelligent agents has evolved. The approach is termed Deep Reinforcement Learning and combines the classic field of Reinforcement Learning (RL) with the representational power of modern Deep Learning approaches. It is very well suited for single task learning but needs a long time to learn any new task. To speed up this process, we propose to extend the concept to multi-task learning by adapting Policy Reuse, a Transfer Learning approach from classic RL, to use with Deep Q-Networks.
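As a rough illustration of how Policy Reuse can bias exploration, the sketch below follows the commonly described scheme in which the agent follows a previously learned policy with probability `psi` and otherwise acts epsilon-greedily on its current value estimates. This is not the authors' implementation: `reuse_action`, `past_policy`, and the plain dict standing in for the Q-values a Deep Q-Network would output are all hypothetical names chosen for the example.

```python
import random


def reuse_action(state, past_policy, q_values, psi, epsilon, rng=random):
    """Pick an action by mixing a past policy with epsilon-greedy exploration.

    past_policy: callable mapping a state to an action (policy from an earlier task).
    q_values:    dict mapping each action to its current estimated value in `state`
                 (a stand-in for the output of a Q-network).
    psi:         probability of following the past policy.
    epsilon:     exploration rate used when the past policy is not followed.
    """
    if rng.random() < psi:
        return past_policy(state)          # reuse old knowledge
    actions = list(q_values)
    if rng.random() < epsilon:
        return rng.choice(actions)         # random exploration
    return max(actions, key=q_values.get)  # greedy on the new task's estimates


# Toy usage: the past policy always says "left"; the current estimates prefer "right".
q = {"left": 0.1, "right": 0.9}
print(reuse_action("s0", lambda s: "left", q, psi=1.0, epsilon=0.0))
print(reuse_action("s0", lambda s: "left", q, psi=0.0, epsilon=0.0))
```

In practice `psi` is usually decayed over the course of an episode so that the agent relies less on the old policy as it gathers experience on the new task; that schedule is omitted here.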
An Advising Framework for Multiagent Reinforcement Learning Systems
Silva, Felipe Leno da (Escola Politécnica da Universidade de São Paulo) | Glatt, Ruben (Escola Politécnica da Universidade de São Paulo) | Costa, Anna Helena Reali (Escola Politécnica da Universidade de São Paulo)
Reinforcement Learning has long been employed to solve sequential decision-making problems with minimal input data. However, the classical approach requires a long time to learn a suitable policy, especially in Multiagent Systems. The teacher-student framework proposes to mitigate this problem by integrating an advising procedure in the learning process, in which an experienced agent (human or not) can advise a student to guide her exploration. However, the teacher is assumed to be an expert in the learning task. We here propose an advising framework where multiple agents advise each other while learning in a shared environment, and the advisor is not expected to necessarily act optimally. Our experiments in a simulated Robot Soccer environment show that the learning process is improved by incorporating this kind of advice.
Spatial Projection of Multiple Climate Variables Using Hierarchical Multitask Learning
Goncalves, Andre R. (Center for Research and Development in Telecommunication (CPqD)) | Banerjee, Arindam (University of Minnesota - Twin Cities) | Zuben, Fernando J. Von (University of Campinas)
Future projection of climate is typically obtained by combining outputs from multiple Earth System Models (ESMs) for several climate variables such as temperature and precipitation. While IPCC has traditionally used a simple model output average, recent work has illustrated potential advantages of using a multitask learning (MTL) framework for projections of individual climate variables. In this paper we introduce a framework for hierarchical multitask learning (HMTL) with two levels of tasks such that each super-task, i.e., task at the top level, is itself a multitask learning problem over sub-tasks. For climate projections, each super-task focuses on projections of specific climate variables spatially using an MTL formulation. For the proposed HMTL approach, a group lasso regularization is added to couple parameters across the super-tasks, which in the climate context helps exploit relationships among the behavior of different climate variables at a given spatial location. We show that some recent works on MTL based on learning task dependency structures can be viewed as special cases of HMTL. Experiments on synthetic and real climate data show that HMTL produces better results than decoupled MTL methods applied separately on the super-tasks and HMTL significantly outperforms baselines for climate projection.
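To make the coupling idea concrete, here is a minimal sketch of a group lasso penalty across super-tasks. The grouping is an assumption for illustration only: each group collects the parameter at the same position from every super-task, so the penalty encourages a position to be active or inactive jointly across climate variables. The abstract does not specify which parameters are grouped; `group_lasso_penalty` and `lam` are hypothetical names.

```python
import math


def group_lasso_penalty(params_per_supertask, lam=1.0):
    """Sum of Euclidean norms of groups, scaled by `lam`.

    params_per_supertask: list of equal-length lists of floats, one list per super-task.
    Group j is formed from the j-th parameter of every super-task (an assumed grouping).
    """
    n_params = len(params_per_supertask[0])
    if any(len(p) != n_params for p in params_per_supertask):
        raise ValueError("all super-tasks must have the same number of parameters")

    total = 0.0
    for j in range(n_params):
        # L2 norm over the j-th parameter across super-tasks.
        total += math.sqrt(sum(p[j] ** 2 for p in params_per_supertask))
    return lam * total


# Two super-tasks (say, temperature and precipitation) with two parameters each.
print(group_lasso_penalty([[3.0, 0.0], [4.0, 0.0]]))
```

Because the norm is taken per group rather than per entry, a group only drops out of the penalty when all of its members are zero, which is what ties the super-tasks together.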
Grid Pathfinding on the $2^k$ Neighborhoods
Rivera, Nicolas (King's College London) | Hernández, Carlos (Universidad Andrés Bello) | Baier, Jorge A. (Pontificia Universidad Catolica de Chile)
Grid pathfinding, an old AI problem, is central for the development of navigation systems for autonomous agents. A surprising fact about the vast literature on this problem is that very limited neighborhoods have been studied. Indeed, only the 4- and 8-neighborhoods are usually considered, and rarely the 16-neighborhood. This paper describes three contributions that enable the construction of effective grid path planners for extended $2^k$-neighborhoods. First, we provide a simple recursive definition of the $2^k$-neighborhood in terms of the $2^{k-1}$-neighborhood. Second, we derive distance functions, for any $k > 1$, which allow us to propose admissible heuristics that are perfect for obstacle-free grids. Third, we describe a canonical ordering which allows us to implement a version of A* whose performance scales well when increasing $k$. Our empirical evaluation shows that the heuristics we propose are superior to the Euclidean distance (ED) when regular A* is used. For grids beyond 64 the overhead of computing the heuristic yields decreased time performance compared to the ED. We also found that a configuration of our A*-based implementation, without canonical orders, is competitive with the "any-angle" path planner Theta$^*$ both in terms of solution quality and runtime.
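The recursive definition mentioned in this abstract can be sketched as follows. The abstract does not spell out the recursion, so this example assumes one construction that reproduces the familiar 4-, 8-, and 16-neighborhoods: starting from the four axis moves, each step inserts between every pair of angularly adjacent moves their vector sum. The function name `neighborhood` is hypothetical.

```python
def neighborhood(k):
    """Return the moves of the 2^k-neighborhood (k >= 2) in counter-clockwise order.

    Assumed recursion: the 2^k-neighborhood keeps every move of the
    2^(k-1)-neighborhood and adds, between each pair of consecutive moves,
    the sum of the two.
    """
    if k < 2:
        raise ValueError("k must be at least 2")
    if k == 2:
        return [(1, 0), (0, 1), (-1, 0), (0, -1)]  # 4-neighborhood

    prev = neighborhood(k - 1)
    moves = []
    for i, (x, y) in enumerate(prev):
        nx, ny = prev[(i + 1) % len(prev)]  # next move, wrapping around
        moves.append((x, y))
        moves.append((x + nx, y + ny))      # new move between the two
    return moves


for k in (2, 3, 4):
    print(2 ** k, neighborhood(k))
```

Under this construction the number of moves doubles at each level, and the 16-neighborhood adds the knight-like moves such as (2, 1) and (1, 2) to the 8-neighborhood.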
Ford To Invest $1B In Self-Driving AI
Autonomous cars are still mostly in test-drive where our streets are concerned, but the race for this technology now has many major firms kicking their AI investment into high gear. Ford Motor Company announced Saturday that it intends to invest $1 billion in Argo AI, an artificial intelligence start-up led by some familiar industry faces, over the next five years. The company's aim is to develop a virtual driver system for the autonomous vehicle it plans to roll out in 2021, and, presuming the price is right, for use with other automakers'.